| author | Thorsten Töpper <atsutane@freethoughts.de> | 2025-08-10 18:16:07 +0200 |
|---|---|---|
| committer | Thorsten Töpper <atsutane@freethoughts.de> | 2025-08-10 18:16:07 +0200 |
| commit | 9e2f3d59cf249403859916df9756c179753ea7e0 (patch) | |
| tree | 6aaacfd22fc681fb7d95826ef65726c392cfc7d8 /include/output.h | |
| parent | 5b743929d23ca0e8004fe2d6bc8ff5c04ed9dbb9 (diff) | |
split_for_sort: Split a given file into buckets
The target bucket is decided based on the first X characters of a line.
Each bucket name starts with a prefix given as an argument. The buckets
can then be sorted separately, which is faster on weak hardware.
Note: This is just an alternative to split.
Real-world usage in a shell script with a file in which the first 10
characters are equal in each line, so the following 2 bytes are
evaluated for splitting:
split_for_sort TMPSFS 12 raw_data.txt
for f in TMPSFS* ; do
sort -o "${f}_sorted" -u "${f}"
done
# Rely on glob expansion to return the names in lexical order
cat TMPSFS*_sorted > sorted_data.txt
rm TMPSFS*
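The bucket selection described above could look roughly like the following sketch. It is not the code from this commit: the helper names `bucket_name` and `split_into_buckets`, the hex encoding of the key, and the fixed buffer sizes are all assumptions made for illustration. The hex encoding is chosen only so that arbitrary key bytes yield safe file names whose lexical order matches the byte order of the keys.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "<prefix><hex of the first key_len bytes>" to out.
 * Returns 0 on success, -1 if out is too small. */
int bucket_name(const char *prefix, const char *line, size_t key_len,
                char *out, size_t out_size)
{
    assert(prefix != NULL && line != NULL && out != NULL);

    /* A short line ends the key early at its newline or terminator. */
    size_t avail = strcspn(line, "\n");
    if (avail < key_len)
        key_len = avail;

    size_t prefix_len = strlen(prefix);
    if (prefix_len + 2 * key_len + 1 > out_size)
        return -1;

    memcpy(out, prefix, prefix_len);
    for (size_t i = 0; i < key_len; i++)
        snprintf(out + prefix_len + 2 * i, 3, "%02x", (unsigned char)line[i]);
    out[prefix_len + 2 * key_len] = '\0';
    return 0;
}

/* Hypothetical helper: append every line of `in` to the bucket chosen by
 * its key. Reopening the bucket for each line is slow but keeps the sketch
 * short. Assumption: every line fits into the 4096-byte buffer. */
int split_into_buckets(FILE *in, const char *prefix, size_t key_len)
{
    char line[4096];
    char name[512];

    while (fgets(line, sizeof line, in) != NULL) {
        if (bucket_name(prefix, line, key_len, name, sizeof name) != 0)
            return -1;
        FILE *bucket = fopen(name, "a");
        if (bucket == NULL)
            return -1;
        int failed = (fputs(line, bucket) == EOF);
        if (fclose(bucket) != 0 || failed)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

Under these assumptions, a call such as `split_into_buckets(fp, "TMPSFS", 12)` would play the role of the `split_for_sort TMPSFS 12 raw_data.txt` invocation from the commit message.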
Diffstat (limited to 'include/output.h')
| -rw-r--r-- | include/output.h | 20 |
1 files changed, 20 insertions, 0 deletions
diff --git a/include/output.h b/include/output.h
new file mode 100644
index 0000000..efc7487
--- /dev/null
+++ b/include/output.h
@@ -0,0 +1,20 @@
+/*
+ * vim:ts=4:sw=4:expandtab
+ */
+#ifndef OUTPUT_H
+#define OUTPUT_H
+
+#include <stdio.h>
+
+#ifndef LOGERR
+#define LOGERR(...) {fprintf(stderr, "[%s:%d] %s: ", __FILE__, __LINE__, __func__); fprintf(stderr, __VA_ARGS__);}
+#endif
+
+#ifdef DEBUGBUILD
+#define DBGTRC(...) LOGERR(__VA_ARGS__)
+#else
+#define DBGTRC(...)
+#endif
+
+#endif
+
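A minimal sketch of how the two macros from this header might be used at a call site. The macro definitions are copied from the diff so that the example compiles on its own; the function `parse_key_len` is hypothetical and not part of this commit.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Macros copied from include/output.h as added by this commit. */
#ifndef LOGERR
#define LOGERR(...) {fprintf(stderr, "[%s:%d] %s: ", __FILE__, __LINE__, __func__); fprintf(stderr, __VA_ARGS__);}
#endif

#ifdef DEBUGBUILD
#define DBGTRC(...) LOGERR(__VA_ARGS__)
#else
#define DBGTRC(...)
#endif

/* Hypothetical helper: parse a positive key length, return -1 on bad input. */
int parse_key_len(const char *arg)
{
    char *end = NULL;

    /* Expands to nothing unless the file is compiled with -DDEBUGBUILD. */
    DBGTRC("parsing '%s'\n", arg);

    errno = 0;
    long value = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0' || value <= 0 || value > INT_MAX) {
        /* Prints "[file:line] parse_key_len: " followed by the message. */
        LOGERR("invalid key length '%s'\n", arg);
        return -1;
    }
    return (int)value;
}
```

Because `LOGERR` expands to a brace-enclosed block rather than a single statement, an unbraced `if (cond) LOGERR("..."); else ...` would not compile, so the sketch always wraps the call in braces.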
