Friday, September 01, 2017

Splitting a Space Separated Dataset with awk

I thought I would publish a simple script on how I split a dataset that looks like below into a CSV file which Excel could work on. The issue I was having was that Excel couldn't seem to find the spaces and use them. I would end up with extra data in columns on my Mac. Perhaps a flaw in the Excel program, or a flaw in the data, or both. However, awk digested it just fine.
1
2
3
4
5
6
7
8
9
10
11
08/31/17 14:00:00 780 10
08/31/17 14:00:01 789 9
08/31/17 14:00:02 817 28
08/31/17 14:00:03 868 51
08/31/17 14:00:04 927 59
08/31/17 14:00:05 986 59
08/31/17 14:00:06 1031 45
08/31/17 14:00:07 1062 31
08/31/17 14:00:08 1107 45
08/31/17 14:00:09 1150 43
08/31/17 14:00:10 1206 56
So here is my script to make it a little more useful for Excel. Mind you it is simple, but there are 1.7 million lines of logging, and it took about a second on my machine to convert. Slick by any measure. The problem now is that Excel is choking on the number of records to display in a chart.
1
awk '{printf("%s,%s,%s,%s\n",$1,$2,$3,$4)}' synthinet.log > synthinet.csv
1
2
3
4
5
6
7
8
9
10
09/01/17,07:38:04,805772,0
09/01/17,07:38:05,805772,0
09/01/17,07:38:06,805772,0
09/01/17,07:38:07,805772,0
09/01/17,07:38:08,805772,0
09/01/17,07:38:09,805772,0
09/01/17,07:38:10,805772,0
09/01/17,07:38:11,805772,0
09/01/17,07:38:12,805772,0
09/01/17,07:38:13,805772,0

Popular Posts