decomet is a very fast Windows command-line tool that minifies source code. By default it removes all comments: line comments starting with // and block comments enclosed in /* */.
Features:
- removes all blank/empty lines. A blank line is one that contains only whitespace*
- removes Unicode control characters in the ranges U+0001..U+0008, U+000E..U+001F and U+007F..U+009F, which leaves U+0009..U+000D (tab, line feed, vertical tab, form feed and carriage return) in place
- collapses runs of duplicate empty lines to 1 line, to keep the code readable
- removes indent whitespace*
- minifies and normalizes whitespace* to a single space
- prefixes each line with its line number: line number, tab, then the line
- recurses into sub-directories
- funnels output to a single output directory
- adds a datetime stamp to the output filename (update as of Mon 10-Jun-24 2:57pm EDT)
*ISO 30112 defines POSIX whitespace characters for function iswspace() for locale 'en_US.UTF8' as Unicode characters U+0009..U+000D, U+0020, U+1680, U+180E, U+2000..U+2006, U+2008..U+200A, U+2028, U+2029, U+205F, and U+3000
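The cleanup features listed above can be sketched in a few lines of Python. This is only an illustration of the rules as described in this post, not decomet's actual C/C++ code; the function names are made up for the example, and unlike a real minifier this sketch does not protect whitespace inside string literals.

```python
# Whitespace code points exactly as listed in the footnote above.
WHITESPACE = (
    "\u0009\u000A\u000B\u000C\u000D\u0020\u1680\u180E"
    "\u2000\u2001\u2002\u2003\u2004\u2005\u2006"
    "\u2008\u2009\u200A\u2028\u2029\u205F\u3000"
)
WS_SET = frozenset(WHITESPACE)


def is_blank(line):
    """True if the line is empty or holds only listed whitespace."""
    return all(ch in WS_SET for ch in line)


def strip_indent(line):
    """Remove leading whitespace (the indent) from one line."""
    return line.lstrip(WHITESPACE)


def minify_whitespace(line):
    """Replace every run of listed whitespace with one ASCII space."""
    out = []
    prev_ws = False
    for ch in line:
        if ch in WS_SET:
            if not prev_ws:
                out.append(" ")
            prev_ws = True
        else:
            out.append(ch)
            prev_ws = False
    return "".join(out)


def collapse_blank_runs(lines):
    """Keep only the first line of each run of blank lines."""
    out = []
    prev_blank = False
    for line in lines:
        blank = is_blank(line)
        if blank and prev_blank:
            continue
        out.append(line)
        prev_blank = blank
    return out


def strip_controls(text):
    """Drop U+0001..U+0008, U+000E..U+001F and U+007F..U+009F."""
    def removed(ch):
        cp = ord(ch)
        return 0x01 <= cp <= 0x08 or 0x0E <= cp <= 0x1F or 0x7F <= cp <= 0x9F
    return "".join(ch for ch in text if not removed(ch))
```

For example, `minify_whitespace("a \t\u2002 b  c")` gives `"a b c"`, and U+2007 (figure space) is left alone because it is not in the list.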
It is very fast and written in a mix of C and C++.
It reads and writes UTF-8 source code.
Unicode filenames are supported.
It reports elapsed time in a human-readable form.
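The speed tests in this post show elapsed times such as "3ms" and "6min 29s 170ms". A minimal sketch of that kind of formatting follows, assuming that zero-valued parts are simply omitted and that hours are not needed; `format_elapsed` is a hypothetical name, not a decomet function.

```python
def format_elapsed(total_ms):
    """Format a millisecond count like '6min 29s 170ms'."""
    minutes, rem = divmod(total_ms, 60_000)
    seconds, millis = divmod(rem, 1000)
    parts = []
    if minutes:
        parts.append(f"{minutes}min")
    if seconds:
        parts.append(f"{seconds}s")
    # Always print something, even for a 0 ms run.
    if millis or not parts:
        parts.append(f"{millis}ms")
    return " ".join(parts)
```

With this sketch, `format_elapsed(389170)` returns `"6min 29s 170ms"` and `format_elapsed(3)` returns `"3ms"`.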
This project is based on the code at http://code.google.com/p/cpp-decomment/, greatly improved to handle Unicode spaces, control characters, UTF-8 files and UTF-8 filenames. The state machine has also been optimized and fixed.
The code was also reworked to make sure all the switches actually work.
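To show why comment removal needs a state machine rather than a plain search for //, here is a small Python sketch. It is not the decomet or cpp-decomment implementation: the states, the name `strip_comments`, and the choice to replace each block comment with one space are assumptions made for illustration.

```python
def strip_comments(src):
    """Remove // and /* */ comments, leaving string/char literals intact."""
    CODE, LINE, BLOCK, STRING, CHAR = range(5)
    state = CODE
    out = []
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        nxt = src[i + 1] if i + 1 < n else ""
        if state == CODE:
            if ch == "/" and nxt == "/":
                state = LINE
                i += 2
                continue
            if ch == "/" and nxt == "*":
                state = BLOCK
                i += 2
                continue
            if ch == '"':
                state = STRING
            elif ch == "'":
                state = CHAR
            out.append(ch)
        elif state == LINE:
            # A line comment ends at the newline, which is kept.
            if ch == "\n":
                out.append(ch)
                state = CODE
        elif state == BLOCK:
            if ch == "*" and nxt == "/":
                out.append(" ")  # assumption: keep tokens separated
                state = CODE
                i += 2
                continue
        else:  # inside a string or character literal
            out.append(ch)
            if ch == "\\" and nxt:
                out.append(nxt)  # copy the escaped character as-is
                i += 2
                continue
            if (state == STRING and ch == '"') or (state == CHAR and ch == "'"):
                state = CODE
        i += 1
    return "".join(out)
```

The literal states are what keep a `//` or a lone quote inside a string from being misread, which is the kind of case a simple text search gets wrong.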
Download decomet.zip. Personal use only. The program opens this page after each run.
Contact metadataconsult@gmail.com for a license; $10.00 USD removes the page opening. As with all my software, it is 100% free of malware and spyware. I am trying to sell this, and that would be a bad idea.
decomet -h 2> help.txt - redirects the help message (written to stderr) to the file 'help.txt'
Usage: decomet -[bcehimnprsv] [-d<DIR>] file1.c file2.js ...
Outputs (adds extension .dec.{org ext}) file1.c.dec.c file2.js.dec.js ...
Decomment source files, optionally remove whitespace, control characters and duplicate empty lines
  -b remove all whitespace* blank/empty lines
  -c preprocess & remove control characters in ASCII and UNicode range U+0001..U+0008, U+000E..U+001F and U+007F..U+009F, respectively.
     NOTE: U+001A 'SUB' Substitute character will terminate reading a text file unexpectedly.
  -e Removes duplicate Unicode whitespace* entire lines aka 'empty lines', leaving 1 line.
     *ISO 30112 defines POSIX whitespace characters for function iswspace() for locale 'en_US.UTF8' as Unicode characters U+0009..U+000D, U+0020, U+1680, U+180E, U+2000..U+2006, U+2008..U+200A, U+2028, U+2029, U+205F, and U+3000
  -h display help message
  -i remove indent whitespace*
  -m minify && normalize whitespace* to a single space
  -n prefix with line number
  -p preview files matching wildcard for recursive search
  -r recursive search sub-dirs under the input-file's folder - file wildcard needed
  -s output to stdout, instead of output-files (infile1.c.dec.c)
  -t add datetime to output. Example: infile1.c.dec_10Jun24_1347PM.c
  -v switch off verbose - default on
  -d<DIR> output funnel directory, no space after -d
  file[*?].c input-files, file wildcard [?*] allowed. The output-file is 'filexxx.c.dec.c'
Features:
  Fast, written in mainly C, C++ for Unicode support
  Read and writes UTF-8 text files
  Implements a state machine for parsing to remove comments, enforce min. spaces, etc.
  Implements a stack for file/folder traversal
Limitations:
  Each line length is a max of 100,000 characters wide
  Does not handle long file paths (>260)
Notes:
  org src code - http://code.google.com/p/cpp-decomment/
  improved to handle Unicode, UTF-8 files && remove duplicate lines, Unicode whitespace
  fixed stack imp (org. failed if single double quote found with -m switch)
  improved to assure all switches work correctly, etc.
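The output naming rule in the help text (file1.c becomes file1.c.dec.c, and -t gives infile1.c.dec_10Jun24_1347PM.c) can be sketched as below. This is a guess at the rule from those two examples only; `output_name` is a hypothetical helper, and how decomet names files that have no extension is not stated, so the sketch just appends `.dec` in that case.

```python
import os
from datetime import datetime
from typing import Optional


def output_name(infile: str, stamp: Optional[datetime] = None) -> str:
    """Build '<infile>.dec[_<stamp>]<original extension>'."""
    ext = os.path.splitext(infile)[1]  # e.g. ".c"; empty if no extension
    suffix = ".dec"
    if stamp is not None:
        # Matches the help example: day, month abbreviation, 2-digit year,
        # then 24-hour HHMM followed by AM/PM (English month names assumed).
        suffix += stamp.strftime("_%d%b%y_%H%M%p")
    return infile + suffix + ext
```

Calling `output_name("infile1.c", datetime(2024, 6, 10, 13, 47))` reproduces the example name from the help text.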
decomet version 2.0.2.4 Copyright © 2024 metadataconsulting.ca, Mark Pahulje
THE SOFTWARE IS PROVIDED "AS IS", EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Speed test on a 100-line C++ file.
1. input2.cpp Input 100 lines. Output 100 lines. Removed 0 lines. Elapsed 3ms.
Speed test on a 1 GB text file.
I:\WORK-CODE\Visual Studio Projects\decomment\Debug>decomet -e 1gb.txt
Input 42949674 lines. Output 42949670 lines. Removed 4 lines. Elapsed 6min 29s 170ms.