diff --git a/.gitea/workflows/build.yml b/.gitea/workflows/build.yml new file mode 100644 index 0000000..e9b8de1 --- /dev/null +++ b/.gitea/workflows/build.yml @@ -0,0 +1,30 @@ +name: Build + +on: [push] + +jobs: + build: + runs-on: ubuntu-gitea + + steps: + + - name: Prep For Local Builds + run: echo "${LOCIP} gitea.comnenos" >> /etc/hosts + + - uses: actions/checkout@v4 + + - name: Install Build Tools + run: | + apt update + apt -y --no-install-recommends install build-essential gcc make + + - name: Build + run: make + + - name: Verify Binaries + run: | + echo "=== Built binaries ===" + ls -la build/ + echo "" + echo "=== Version check (cnhelp) ===" + ./build/cnhelp diff --git a/INTERNALS.md b/INTERNALS.md new file mode 100644 index 0000000..31d0c62 --- /dev/null +++ b/INTERNALS.md @@ -0,0 +1,784 @@ +# cnotes Internals Guide + +A detailed walkthrough of the cnotes codebase, focusing on C90 file handling, memory management, and cross-platform techniques. + +## Table of Contents + +1. [Project Structure](#project-structure) +2. [Platform Abstraction (platform.h)](#platform-abstraction) +3. [Configuration (config.h)](#configuration) +4. [File I/O Patterns](#file-io-patterns) +5. [Module Walkthrough](#module-walkthrough) + - [cnadd.c - Writing to Files](#cnadd---writing-to-files) + - [cndump.c - Reading and Parsing](#cndump---reading-and-parsing) + - [cnfind.c - Searching](#cnfind---searching) + - [cncount.c - Aggregation](#cncount---aggregation) + - [cndel.c - File Rewriting](#cndel---file-rewriting) +6. [Memory Management](#memory-management) +7. [String Handling in C90](#string-handling-in-c90) +8. 
[Cross-Platform Considerations](#cross-platform-considerations) + +--- + +## Project Structure + +``` +cnotes/ +├── include/ +│ ├── config.h # Application configuration constants +│ └── platform.h # Platform-specific abstractions +├── src/ +│ ├── cnadd.c # Add new entries +│ ├── cndump.c # Display entries +│ ├── cnfind.c # Search entries +│ ├── cncount.c # Statistics +│ ├── cndel.c # Archive (delete) entries +│ └── cnhelp.c # Help system +├── Makefile # GCC build +├── MAKEFILE.TC # Turbo C++ 3.0 build +└── BUILD.BAT # DOS batch build +``` + +--- + +## Platform Abstraction + +**File: `include/platform.h`** + +This header provides a consistent interface across DOS, Windows, and Unix systems. + +### Key Concepts + +```c +#ifndef PLATFORM_H +#define PLATFORM_H +``` + +The **include guard** prevents multiple inclusion. If `PLATFORM_H` is already defined, the preprocessor skips the entire file. + +### Platform Detection + +```c +#if defined(__MSDOS__) || defined(__DOS__) + /* DOS-specific code */ +#elif defined(_WIN32) + /* Windows-specific code */ +#else + /* Unix/Linux/macOS code */ +#endif +``` + +Compilers pre-define macros that identify the target platform: +- `__MSDOS__`, `__DOS__` - DOS compilers (Turbo C, DJGPP) +- `_WIN32` - Windows compilers (MSVC, MinGW) +- Neither - Assumed to be Unix-like + +### Platform-Specific Definitions + +| Macro | DOS | Windows | Unix | +|-------|-----|---------|------| +| `PATH_SEPARATOR` | `'\\'` | `'\\'` | `'/'` | +| `PATH_SEP_STR` | `"\\"` | `"\\"` | `"/"` | +| `HOME_ENV` | `"CNOTES_HOME"` | `"USERPROFILE"` | `"HOME"` | +| `mkdir_portable(p)` | `mkdir(p)` | `_mkdir(p)` | `mkdir(p, 0755)` | + +**Why two path separator forms?** +- `PATH_SEPARATOR` (char) - For character comparisons +- `PATH_SEP_STR` (string) - For string concatenation with `sprintf()` + +### The mkdir Problem + +Different systems have different `mkdir()` signatures: + +```c +/* DOS (dir.h) */ +int mkdir(const char *path); + +/* Windows (direct.h) */ +int 
_mkdir(const char *path); + +/* Unix (sys/stat.h) */ +int mkdir(const char *path, mode_t mode); +``` + +The `mkdir_portable()` macro abstracts this difference. + +--- + +## Configuration + +**File: `include/config.h`** + +### Compile-Time Defaults + +```c +#ifndef CNOTES_FILE +#define CNOTES_FILE "cnotes.csv" +#endif +``` + +The `#ifndef` pattern allows override at compile time: + +```bash +gcc -DCNOTES_FILE=\"myfile.csv\" ... +``` + +### Memory Constraints + +```c +#ifndef MAX_ENTRIES + #ifdef MAX_ENTRIES_DEFAULT + #define MAX_ENTRIES MAX_ENTRIES_DEFAULT + #else + #define MAX_ENTRIES 5000 + #endif +#endif +``` + +DOS has limited memory (~640KB conventional). `MAX_ENTRIES_DEFAULT` is set to 100 for DOS in `platform.h`, but 5000 for modern systems. + +--- + +## File I/O Patterns + +### Opening Files + +C90 provides `fopen()` with mode strings: + +| Mode | Meaning | +|------|---------| +| `"r"` | Read (file must exist) | +| `"w"` | Write (creates/truncates) | +| `"a"` | Append (creates if needed) | +| `"r+"` | Read/write (file must exist) | +| `"w+"` | Read/write (creates/truncates) | + +**Always check for failure:** + +```c +FILE *fp = fopen(path, "r"); +if (fp == NULL) { + fprintf(stderr, "Error: Cannot open '%s'\n", path); + return 1; +} +``` + +### Reading Lines + +```c +char line[500]; +while (fgets(line, sizeof(line), fp) != NULL) { + /* Process line */ +} +``` + +`fgets()` is safe because it: +1. Takes a maximum length argument +2. Always null-terminates +3. Returns NULL on EOF or error + +**Never use `gets()`** - it has no length limit and is a buffer overflow vulnerability. + +### Writing Data + +```c +/* Formatted output */ +fprintf(fp, "%s,%s,%s,\"%s\"\n", date, time, category, message); + +/* Or build string first, then write */ +sprintf(buffer, "%s,%s\n", field1, field2); +fputs(buffer, fp); +``` + +### Closing Files + +```c +fclose(fp); +``` + +**Always close files** to: +1. Flush buffered data to disk +2. Release system resources +3. 
Allow other programs to access the file + +--- + +## Module Walkthrough + +### cnadd - Writing to Files + +**Purpose:** Append a new timestamped entry to the notes file. + +#### Getting the Current Time + +```c +#include <time.h> + +time_t now; +struct tm *local; + +time(&now); /* Get seconds since epoch */ +local = localtime(&now); /* Convert to local time struct */ + +sprintf(date_str, "%04d-%02d-%02d", + local->tm_year + 1900, /* Years since 1900 */ + local->tm_mon + 1, /* Months are 0-11 */ + local->tm_mday); + +sprintf(time_str, "%02d:%02d", + local->tm_hour, + local->tm_min); +``` + +The `struct tm` fields: +- `tm_year` - Years since 1900 (so 2026 = 126) +- `tm_mon` - Month (0-11, so January = 0) +- `tm_mday` - Day of month (1-31) +- `tm_hour`, `tm_min`, `tm_sec` - Time components + +#### Building the File Path + +```c +int get_cnotes_path(char *buffer, size_t bufsize, const char *filename) { + const char *home = getenv(HOME_ENV); + + if (home == NULL) { + fprintf(stderr, "Error: %s not set\n", HOME_ENV); + return 0; + } + + /* Check buffer size before writing */ + if (strlen(home) + strlen(CNOTES_DIR) + strlen(filename) + 3 > bufsize) { + fprintf(stderr, "Error: Path too long\n"); + return 0; + } + + sprintf(buffer, "%s" PATH_SEP_STR "%s" PATH_SEP_STR "%s", + home, CNOTES_DIR, filename); + return 1; +} +``` + +**Key points:** +1. `getenv()` returns NULL if variable isn't set +2. Always check buffer size before `sprintf()` +3. `PATH_SEP_STR` is a string, so it concatenates directly + +#### Creating Directories + +```c +void ensure_directory_exists(const char *filepath) { + char dir[512]; + char *last_sep; + + strcpy(dir, filepath); + last_sep = strrchr(dir, PATH_SEPARATOR); + + if (last_sep != NULL) { + *last_sep = '\0'; /* Truncate at last separator */ + mkdir_portable(dir); + } +} +``` + +`strrchr()` finds the **last** occurrence of a character. By truncating there, we get the directory portion of the path.
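
The `strcpy()` into `dir[512]` above relies on the path fitting in the buffer. As a hedged illustration of the same `strrchr()` technique with an explicit length check, here is a minimal sketch. The name `dir_portion` and its parameters are hypothetical and are not part of cnotes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of cnotes: copies the directory portion of
   filepath (everything before the last sep) into out.
   Returns 1 on success, 0 if filepath contains no separator or the
   result would not fit in outsize bytes. */
int dir_portion(const char *filepath, char sep, char *out, size_t outsize)
{
    const char *last_sep = strrchr(filepath, sep);
    size_t len;

    if (last_sep == NULL)
        return 0;                   /* no directory component */

    len = (size_t)(last_sep - filepath);
    if (len + 1 > outsize)
        return 0;                   /* directory plus '\0' would overflow out */

    memcpy(out, filepath, len);
    out[len] = '\0';
    return 1;
}
```

A caller could pass `PATH_SEPARATOR` as `sep` and hand the result to `mkdir_portable()` only when the function returns 1.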
+ +#### Appending to File + +```c +FILE *fp = fopen(path, "a"); /* "a" = append mode */ +if (fp == NULL) { + fprintf(stderr, "Error: Cannot open file\n"); + return 1; +} + +fprintf(fp, "%s,%s,%-*s,\"%s\"\n", + date_str, + time_str, + CATEGORY_LENGTH, category, /* Left-justified, padded */ + message); + +fclose(fp); +``` + +The format `%-*s`: +- `-` = left-justify +- `*` = width comes from next argument +- `s` = string + +So `%-*s, CATEGORY_LENGTH, category` prints `category` left-justified in a field of `CATEGORY_LENGTH` characters. + +--- + +### cndump - Reading and Parsing + +**Purpose:** Read all entries and display in a formatted table. + +#### The Entry Structure + +```c +typedef struct { + char date[DATE_LENGTH + 1]; /* +1 for null terminator */ + char time[TIME_LENGTH + 1]; + char category[CATEGORY_LENGTH + 1]; + char text[TXTMSG_LENGTH + 1]; +} Entry; +``` + +**Why +1?** C strings are null-terminated. A 10-character date needs 11 bytes: 10 for characters + 1 for `'\0'`. + +#### Dynamic Memory Allocation + +```c +Entry *entries = (Entry *)malloc(MAX_ENTRIES * sizeof(Entry)); +if (entries == NULL) { + fprintf(stderr, "Error: Cannot allocate memory\n"); + return 1; +} +/* ... use entries ... */ +free(entries); +``` + +**Why malloc instead of stack array?** + +```c +Entry entries[MAX_ENTRIES]; /* BAD on DOS - stack overflow! */ +``` + +DOS has ~64KB stack limit. With `MAX_ENTRIES=5000` and `Entry` being ~150 bytes, that's 750KB - stack overflow! `malloc()` uses the heap, which has more space. 
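
One related hazard is worth a sketch. The byte count passed to `malloc()` is a `size_t` product, and on a compiler where `size_t` is 16 bits wide (an assumption about some DOS toolchains, not something this guide states) 5000 × 150 = 750,000 does not fit and would wrap to a much smaller number. The helper below is hypothetical and is not taken from cnotes; it refuses such requests rather than allocating a block that is too small.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of cnotes: allocates room for count
   elements of elem_size bytes each. Returns NULL if the total byte
   count cannot be represented in size_t, or if malloc() itself fails. */
void *alloc_array_checked(size_t count, size_t elem_size)
{
    /* (size_t)-1 is the largest value size_t can hold */
    if (elem_size != 0 && count > (size_t)-1 / elem_size)
        return NULL;                /* count * elem_size would wrap around */

    return malloc(count * elem_size);
}
```

Under these assumptions, `alloc_array_checked(MAX_ENTRIES, sizeof(Entry))` would return NULL on a platform that cannot express the request, and the existing NULL check would report the failure.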
+ +#### Parsing Fixed-Width Fields + +```c +static const char *parse_fixed_field(const char *ptr, char *dest, + int length, char delimiter) { + if ((int)strlen(ptr) < length) + return NULL; /* Not enough data */ + + strncpy(dest, ptr, length); + dest[length] = '\0'; /* Ensure null-terminated */ + + ptr += length; /* Advance pointer */ + + if (*ptr != delimiter) + return NULL; /* Expected delimiter not found */ + + return ptr + 1; /* Return pointer past delimiter */ +} +``` + +This function: +1. Copies exactly `length` characters to `dest` +2. Null-terminates the result +3. Verifies the expected delimiter follows +4. Returns a pointer to continue parsing, or NULL on error + +**Usage pattern (state machine):** + +```c +const char *ptr = line; +ptr = parse_fixed_field(ptr, entry->date, 10, ','); +if (!ptr) return 0; /* Parse error */ +ptr = parse_fixed_field(ptr, entry->time, 5, ','); +if (!ptr) return 0; +/* ... continue ... */ +``` + +#### Parsing Variable-Width Fields + +```c +static const char *parse_variable_field(const char *ptr, char *dest, + int max_length, char delimiter) { + int i = 0; + + while (*ptr != '\0' && *ptr != delimiter) { + if (i < max_length) { + dest[i++] = *ptr; + } + /* Continue even if truncating, to find delimiter */ + ptr++; + } + + dest[i] = '\0'; + + if (*ptr != delimiter) + return NULL; + + return ptr + 1; +} +``` + +This handles fields of unknown length up to a maximum, with graceful truncation. + +#### Sorting with qsort() + +```c +#include <stdlib.h> + +/* Comparison function signature required by qsort */ +static int compare_by_date(const void *a, const void *b) { + const Entry *entry_a = (const Entry *)a; + const Entry *entry_b = (const Entry *)b; + + int cmp = strcmp(entry_a->date, entry_b->date); + if (cmp != 0) return cmp; + + return strcmp(entry_a->time, entry_b->time); +} + +/* Usage */ +qsort(entries, entry_count, sizeof(Entry), compare_by_date); +``` + +`qsort()` parameters: +1. Array pointer +2. Number of elements +3.
Size of each element +4. Comparison function pointer + +The comparison function must return: +- Negative if a < b +- Zero if a == b +- Positive if a > b + +**Why `const void *`?** C90's `qsort()` is generic - it works with any data type. You cast to your actual type inside the function. + +--- + +### cnfind - Searching + +**Purpose:** Find entries matching search criteria. + +#### Case-Insensitive Search + +```c +#include <string.h> + +/* Convert character to lowercase */ +int to_lower(int c) { + if (c >= 'A' && c <= 'Z') { + return c + ('a' - 'A'); + } + return c; +} + +/* Case-insensitive substring search */ +char *strcasestr_portable(const char *haystack, const char *needle) { + size_t needle_len; + + if (*needle == '\0') + return (char *)haystack; + + needle_len = strlen(needle); + + while (*haystack != '\0') { + /* Check if needle matches at current position */ + size_t i; + int match = 1; + + for (i = 0; i < needle_len && haystack[i] != '\0'; i++) { + if (to_lower(haystack[i]) != to_lower(needle[i])) { + match = 0; + break; + } + } + + if (match && i == needle_len) + return (char *)haystack; + + haystack++; + } + + return NULL; +} +``` + +**Why implement our own?** `strcasestr()` is not part of C90 - it's a POSIX/GNU extension. + +#### Multiple Filter Criteria + +```c +int matches = 1; /* Assume match until proven otherwise */ + +/* Filter by category */ +if (filter_category[0] != '\0') { + if (strcasecmp_portable(entry->category, filter_category) != 0) { + matches = 0; + } +} + +/* Filter by date */ +if (matches && filter_date[0] != '\0') { + if (strcmp(entry->date, filter_date) != 0) { + matches = 0; + } +} + +/* Filter by text pattern */ +if (matches && pattern[0] != '\0') { + if (strcasestr_portable(entry->text, pattern) == NULL) { + matches = 0; + } +} + +if (matches) { + /* Entry passes all filters */ +} +``` + +This "whittle down" approach applies filters incrementally.
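
The category filter above calls `strcasecmp_portable()`, whose body is not shown in this guide. As a rough sketch of what a C90 case-insensitive comparison can look like, the function below uses a hypothetical name, `ci_compare`, and the standard `tolower()` from `<ctype.h>`. It is an illustration under those assumptions, not the cnotes implementation.

```c
#include <ctype.h>

/* Hypothetical sketch, not the cnotes implementation: compares a and b
   while ignoring letter case. Returns 0 if they are equal, a negative
   value if a sorts before b, and a positive value otherwise. */
int ci_compare(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        /* Cast to unsigned char: tolower() is undefined for negative values */
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);

        if (ca != cb)
            return ca - cb;
        a++;
        b++;
    }

    /* At least one string ended; the shorter one sorts first */
    return tolower((unsigned char)*a) - tolower((unsigned char)*b);
}
```

The return convention mirrors `strcmp()`, so a call such as `ci_compare(entry->category, filter_category) != 0` would slot into the same kind of filter test.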
+ +--- + +### cncount - Aggregation + +**Purpose:** Count entries, optionally grouped by category or date. + +#### Tracking Unique Values + +```c +typedef struct { + char key[32]; + int count; +} CountEntry; + +CountEntry counts[MAX_CATEGORIES]; +int num_categories = 0; + +void increment_count(const char *key) { + int i; + + /* Look for existing key */ + for (i = 0; i < num_categories; i++) { + if (strcmp(counts[i].key, key) == 0) { + counts[i].count++; + return; + } + } + + /* Add new key */ + if (num_categories < MAX_CATEGORIES) { + strncpy(counts[num_categories].key, key, 31); + counts[num_categories].key[31] = '\0'; + counts[num_categories].count = 1; + num_categories++; + } +} +``` + +This is a simple associative array. For small datasets, linear search is fine. Larger datasets would benefit from a hash table. + +--- + +### cndel - File Rewriting + +**Purpose:** Remove entries by moving them to an archive file. + +#### The Challenge + +You cannot delete lines from the middle of a file in C. Instead: +1. Read all entries into memory +2. Write non-deleted entries to a temporary file +3. Append deleted entries to archive +4. Replace original with temporary + +#### Safe File Replacement + +```c +/* Read all entries */ +Entry entries[MAX_ENTRIES]; +int count = read_all_entries(entries, source_path); + +/* Open files */ +FILE *temp = fopen(temp_path, "w"); +FILE *archive = fopen(archive_path, "a"); + +/* Write entries to appropriate files */ +for (i = 0; i < count; i++) { + if (should_delete(&entries[i])) { + write_entry(archive, &entries[i]); + deleted_count++; + } else { + write_entry(temp, &entries[i]); + } +} + +fclose(temp); +fclose(archive); + +/* Replace original with temp */ +remove(source_path); +rename(temp_path, source_path); +``` + +**Why archive instead of delete?** The immutable-log philosophy means data is never truly lost - it's just moved to a different file. + +#### Confirmation Prompts + +```c +char response[10]; + +printf("Delete %d entries? 
(y/n): ", count); +fflush(stdout); /* Ensure prompt appears before input */ + +if (fgets(response, sizeof(response), stdin) != NULL) { + if (response[0] == 'y' || response[0] == 'Y') { + /* Proceed with deletion */ + } +} +``` + +`fflush(stdout)` ensures the prompt is displayed before waiting for input. Without it, buffered I/O might delay the prompt. + +--- + +## Memory Management + +### The Golden Rules + +1. **Check malloc() return value** + ```c + ptr = malloc(size); + if (ptr == NULL) { + /* Handle error */ + } + ``` + +2. **Free what you allocate** + ```c + Entry *entries = malloc(...); + /* ... use entries ... */ + free(entries); /* Always free before return */ + ``` + +3. **Don't use after free** + ```c + free(entries); + entries = NULL; /* Prevent accidental use */ + ``` + +4. **Match allocations to deallocations** + Every `malloc()` needs exactly one `free()`. + +### Stack vs Heap + +| Stack | Heap | +|-------|------| +| Automatic allocation | Manual allocation | +| Fixed size (~64KB DOS, ~1MB modern) | Limited by system memory | +| Fast allocation | Slower allocation | +| Automatic cleanup | Must call `free()` | + +```c +void function(void) { + char buffer[100]; /* Stack - automatic */ + char *data = malloc(100); /* Heap - manual */ + + /* buffer freed automatically when function returns */ + free(data); /* Must free explicitly */ +} +``` + +--- + +## String Handling in C90 + +### String Basics + +C strings are arrays of `char` terminated by `'\0'` (null character). + +```c +char str[10] = "Hello"; +/* Memory: ['H','e','l','l','o','\0',?,?,?,?] */ +/* 0 1 2 3 4 5 6 7 8 9 */ +``` + +### Safe String Functions + +| Unsafe | Safe | Notes | +|--------|------|-------| +| `gets()` | `fgets()` | Always use fgets | +| `strcpy()` | `strncpy()` | Specify max length | +| `sprintf()` | `snprintf()`* | *Not in C90 | + +**strncpy() gotcha:** + +```c +char dest[10]; +strncpy(dest, source, 9); +dest[9] = '\0'; /* strncpy may not null-terminate! 
*/ +``` + +If `source` is longer than 9 characters, `strncpy()` won't add a null terminator. Always add it manually. + +### String Length vs Buffer Size + +```c +char buffer[100]; /* Buffer SIZE is 100 */ +strcpy(buffer, "Hello"); +/* String LENGTH is 5 (not counting '\0') */ +/* strlen(buffer) returns 5 */ +``` + +Always allocate `strlen(str) + 1` bytes for a copy. + +--- + +## Cross-Platform Considerations + +### Line Endings + +| System | Line Ending | +|--------|-------------| +| Unix/Linux/macOS | `\n` (LF) | +| Windows | `\r\n` (CRLF) | +| Classic Mac | `\r` (CR) | + +When reading with `fgets()`, the line ending is included. You may need to strip it: + +```c +char *newline = strchr(line, '\n'); +if (newline) *newline = '\0'; + +char *cr = strchr(line, '\r'); +if (cr) *cr = '\0'; +``` + +### Path Separators + +Handled by `PATH_SEPARATOR` and `PATH_SEP_STR` macros in `platform.h`. + +### Environment Variables + +| System | Home Directory | +|--------|----------------| +| Unix | `HOME` | +| Windows | `USERPROFILE` | +| DOS | None standard | + +The `HOME_ENV` macro abstracts this. + +### Integer Sizes + +C90 only guarantees minimums: +- `char`: at least 8 bits +- `short`: at least 16 bits +- `int`: at least 16 bits +- `long`: at least 32 bits + +For portable code, don't assume `int` is 32 bits (it's 16 bits on DOS). + +--- + +## Summary + +The cnotes codebase demonstrates several important C90 patterns: + +1. **File I/O**: Opening, reading line-by-line, writing formatted data, closing +2. **Parsing**: State-machine approach with pointer advancement +3. **Memory**: malloc/free for large data, stack for small buffers +4. **Strings**: Careful length tracking, null termination +5. **Portability**: Preprocessor conditionals for platform differences +6. **Error Handling**: Check every return value + +These patterns form the foundation of robust C programming and are still relevant in modern systems programming.