I would like to slice a huge JSON file (~20 GB) into smaller chunks of data based on array size (10000, 50000, etc.). Input: I am currently running a loop to get the desired output by incrementing the x/y values, but performance is very slow: each iteration takes 8-20 seconds to complete, depending on the size of the file …
Tag: split
Rename files produced by split
I split the huge file, and the output is several files whose names start with the character x. I want to rename them so that they form a list sorted by name, like below: part-1.gz part-2.gz part-3.gz … I tried the command below: for (( i = 1; i <= 3; i++ )) ;do for f in `ls -l | awk '{print $9}' | grep
How to split file based on first character in Linux shell
I have a fixed-width flat file with header and detail data. Both can be recognized by the first character: 1 for header and 2 for detail. I want to generate 2 different files from my fixed-width file, each file having its own record set, but without the record type written. File Header.txt havi…
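A minimal Python sketch of the split described in this question, assuming each record is one line, that the first character is the record type (1 for header, 2 for detail), and that the type character is dropped on output. The names `split_by_record_type` and `write_split` and the `Detail.txt` filename are hypothetical; only Header.txt appears in the excerpt.

```python
def split_by_record_type(lines):
    """Route records by their first character and strip that character."""
    headers, details = [], []
    for line in lines:
        if line.startswith("1"):
            headers.append(line[1:])
        elif line.startswith("2"):
            details.append(line[1:])
        # Records with any other type character are skipped (assumption).
    return headers, details


def write_split(input_path, header_path="Header.txt", detail_path="Detail.txt"):
    """Read the flat file and write one output file per record type."""
    with open(input_path) as src:
        headers, details = split_by_record_type(src)
    with open(header_path, "w") as out:
        out.writelines(headers)
    with open(detail_path, "w") as out:
        out.writelines(details)
```

This version holds both record sets in memory, which is fine for modest files; for very large inputs, writing each line as it is read would be the safer choice.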
Python way to find unique version of software
I have multiple components of a piece of software (let’s call it XYZ) installed on my Linux (RHEL 6.2) server running Python 3.3. I’m trying to convert my install/upgrade script from shell to Python. For that I need to fetch the version number, but only once. In my Python script I’ve added the bel…
Split string in ksh
I have a string as follows: What would be the best way to convert it to seconds without using date? I tried something like this, which looked promising, but no luck: Any idea is welcome. I just can’t use date -d for the conversion, as it is not available on the system I am working on.
Having some issues with Perl Splitting and Merging Functions
First and foremost, I’m not familiar with Perl at all. I’ve been studying C++ primarily for the last 1/2 year. I’m in a class now that is teaching Linux commands, and we have short little topics on languages used in Linux, including Perl, which is totally throwing me for a loop (no pun …
How to split a multi-gigabyte file into chunks of about 1.5 gigabytes using Linux split?
I have a file that can be bigger than 4 GB. I am using the Linux split command to split it by lines (that’s the requirement). But after splitting the original file, I want each split file to always be smaller than 2 GB. The original file size can vary from 3-5 GB. I want to write some logic for
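One way to sketch this requirement in Python: copy whole lines into numbered chunk files and start a new chunk whenever the next line would push the current one past a byte limit. The function name `split_by_lines`, the `prefix` naming scheme, and the three-digit suffix are assumptions for illustration. A single line longer than the limit still gets a chunk of its own, so the cap holds only when every line fits.

```python
def split_by_lines(src_path, prefix, max_bytes):
    """Write whole lines into numbered chunk files of at most max_bytes each."""
    paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Open a new chunk at the start, or when this line would exceed the cap.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                path = f"{prefix}{len(paths):03d}"
                out = open(path, "wb")
                paths.append(path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return paths
```

For chunks of about 1.5 GB, `max_bytes` would be something like `1500 * 1024 * 1024`. GNU split also offers a `--line-bytes` (`-C`) option that serves a similar purpose from the shell.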
Why does the Linux split program behave weirdly with large files >20GB?
I’m running the following statement on my Ubuntu machine: If I do an “ls”: If I do a “du” over the ab and ad files: As you can see, split divided the file unevenly. Does anyone know what’s going on? Could some unprintable character hang split? Thank you. Best Regards! Francisco. A…
Splitting gzipped logfiles without storing the ungzipped splits on disk
I have a recurring task of splitting a set of large (about 1-2 GiB each) gzipped Apache logfiles into several parts (say chunks of 500K lines). The final files should be gzipped again to limit the disk usage. On Linux I would normally do: The resulting files will be named xaa, xab, xac, etc So I do: The…
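A minimal Python sketch of the idea in this question, using only the standard `gzip` module: the input is decompressed as a stream and each chunk is recompressed as it is written, so no uncompressed split ever lands on disk. The function name `split_gzip_by_lines` and the output naming (`prefix` plus a four-digit counter and `.gz`) are assumptions, not the naming the original poster used.

```python
import gzip


def split_gzip_by_lines(src_path, prefix, lines_per_chunk):
    """Stream a gzipped file into gzipped chunks of lines_per_chunk lines each."""
    paths = []
    out = None
    with gzip.open(src_path, "rb") as src:
        for index, line in enumerate(src):
            # Every lines_per_chunk lines, close the current chunk and open the next.
            if index % lines_per_chunk == 0:
                if out is not None:
                    out.close()
                path = f"{prefix}{len(paths):04d}.gz"
                out = gzip.open(path, "wb")
                paths.append(path)
            out.write(line)
    if out is not None:
        out.close()
    return paths
```

For the chunk size mentioned in the question, `lines_per_chunk` would be `500_000`. Only one line is held in memory at a time, so the size of the input file should not matter.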