I have a file with around 85 million JSON records. The file size is around 110 GB. I want to read from this file sequentially, in batches of 1 million records. I read the file line by line using a scanner and append each line to a batch until it holds 1 million records. Here is the gist of what I am doing:
    var rawBatch []string
    batchSize := 1000000

    file, err := os.Open(filePath)
    if err != nil {
        // error handling
    }

    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        // Text() returns a copy of the line; the scanner reuses its buffer.
        rec := scanner.Text()
        rawBatch = append(rawBatch, rec)
        if len(rawBatch) == batchSize {
            for i := 0; i < batchSize; i++ {
                var tRec parsers.TRecord
                err := json.Unmarshal([]byte(rawBatch[i]), &tRec)
                if err != nil {
                    // Error thrown here
                }
            }
            // process
            rawBatch = nil
        }
    }
    file.Close()
Sample of correct record:
    type TRecord struct {
        Key1 string `json:"key1"`
        Key2 string `json:"key2"` // must be exported for encoding/json to fill it
    }

    {"key1":"15","key2":"21"}
The issue I am facing is that some of these records get corrupted while being read, for example a colon changes to a semicolon, or a double quote changes to #. I get this error:
Unable to load Record: Unable to load record in: {"key1":#15","key2":"21"} invalid character '#' looking for beginning of value
Some observations:
- Once we start reading, the contents of the file itself get corrupted.
- For every batch of 1 million, I saw 1 (or max 2) records getting corrupted. Out of 84 million records, a total of 95 records were corrupted.
- My code works for a file of around 42 GB (23 million records). With a larger data file, it behaves erroneously.
- ‘:’ are changing to ‘;’. Double quotes are changing to ‘#’. Space is changing to ‘!’. All these combinations, in their binary representations, have a single bit difference. Any chance that we might have some accidental bit manipulation?
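The single-bit observation above can be checked with a few lines of Go. This is only a minimal sketch: the helper name flippedBits is made up for illustration, and the three pairs are the ones listed in the observations.

```go
package main

import (
	"fmt"
	"math/bits"
)

// flippedBits (hypothetical helper) returns how many bit positions
// differ between two bytes: XOR marks the differing bits, and
// OnesCount8 counts them.
func flippedBits(a, b byte) int {
	return bits.OnesCount8(a ^ b)
}

func main() {
	// Original byte and the corrupted byte seen in the bad records.
	pairs := [][2]byte{{':', ';'}, {'"', '#'}, {' ', '!'}}
	for _, p := range pairs {
		fmt.Printf("%q (%08b) -> %q (%08b): %d bit(s) differ\n",
			p[0], p[0], p[1], p[1], flippedBits(p[0], p[1]))
	}
}
```

In all three pairs only the lowest bit differs (0x3A vs 0x3B, 0x22 vs 0x23, 0x20 vs 0x21), which is consistent with a single flipped bit per corrupted byte.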
Any ideas on why this is happening? And how can I fix it?
Details:
- Go version used: go1.15.6 darwin/amd64
- Hardware details: Debian GNU/Linux 9.12 (stretch), 224 GB RAM, 896 GB hard disk
Answer
As suggested by @icza in the comments,
That occasional, very rare 1 bit change suggests hardware failure (memory, processor cache, hard disk). I do recommend to test it on another computer.
I tested my code on some other machines, and it runs without errors there. It looks like the occasional, rare bit change was caused by a hardware failure on the original machine.