I have a Python script that outputs a file with text similar to this:
1234insert into\defaulte72303FINISHEDFalse23msN/A37m10s1052017-08-23 09:55:10.1554070002017-08-23 09:55:10.178453000
This data is delimited by “” and is imported into a table in a Hive database.
My issue is that some of the data contains the ^M (carriage return) character, which splits up my data:
1234INSERT INTO customer_touch.XXX_test_data_pickup^M (^M CIdefaulte72303FINISHEDFalse331ms / 0 ( 0%)37m11s2017-08-23 09:55:08.0666200002017-08-23 09:55:08.398299000
I need to remove ^M and keep each record on a single line. I have tried running dos2unix on the file, which does remove ^M, but my data is still split.
Below is my code. I have a crontab entry set up that writes its output to a text file.
datanodes = ["https://XXXXXXX/", "https://XXXXXXX"]
for i, datanode in enumerate(datanodes):
    try:
        response = requests.get(datanode + "queries?json",
                                auth=HTTPDigestAuth(XXX, XXX),
                                verify='XXXX.pem')
        data = response.json()
        for query in data['completed_queries']:
            print(query['query_id'] + "\" +
                  query['stmt'][0:80] + "\" +
                  query['default_db'] + "\" +
                  query['effective_user'] + "\" +
                  query['state'] + "\" +
                  str(query['executing']) + "\" +
                  query['duration'] + "\" +
                  query['progress'] + "\" +
                  query['waiting_time'] + "\" +
                  str(query['rows_fetched']) + "\" +
                  query['start_time'] + "\" +
                  query['end_time'])
    except IOError as ioe:
        print(ioe)
    except Exception as e:
        print(e)
Answer
I was able to remove ^M with replace('\r', '') per Charles Duffy's suggestion. I changed my code to query['stmt'][0:80].replace('\r', '')
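The same fix can be applied to every field rather than just the statement text. The following is only a sketch: the helper names clean_field and format_row, the sample record, and the "|" delimiter are all made up for illustration and are not part of the original script.

```python
def clean_field(value):
    """Convert a value to text and drop carriage returns (displayed as ^M)."""
    return str(value).replace("\r", "")


def format_row(query, field_names, delimiter):
    """Join the cleaned fields of one query record into a single output line."""
    return delimiter.join(clean_field(query[name]) for name in field_names)


# Hypothetical record shaped like one entry of data['completed_queries'].
sample = {
    "query_id": "1234",
    "stmt": "INSERT INTO t\r (\r col",
    "state": "FINISHED",
    "executing": False,
}

# "|" is a placeholder delimiter for illustration only.
row = format_row(sample, ["query_id", "stmt", "state", "executing"], "|")
print(row)  # 1234|INSERT INTO t ( col|FINISHED|False
```

Cleaning each value before joining means a stray carriage return in any column cannot break a record apart. If the statements may also contain line feeds, an extra .replace("\n", " ") in clean_field would be one way to handle them, but that goes beyond what the original fix needed.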