$cat stat1
567.20 0.88
45.29 3.08
296.58 21.50
0.33 0.14
The first column holds the properties measured in a particular simulation run, and the second column holds the corresponding standard errors. The N different "stat" files come from N independent simulation runs. In the final report, I like to give the average of each property together with its associated standard error. The following shell script, DataAgg.sh, creates a new file, TotalProp, which contains exactly that.
$cat TotalProp
567.49 0.24
43.57 0.45
289.91 1.61
0.67 0.10
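For reference, the combination intended here is, as far as I can tell, the usual error propagation for the mean of N independent runs. Here x_i is the property from run i and \sigma_i its standard error; the symbols are my own notation, not something that appears in the data files.

```latex
\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i ,
\qquad
\sigma_{\bar{x}} = \frac{1}{N} \sqrt{\sum_{i=1}^{N} \sigma_i^{2}}
```

Each row of the "stat" files is treated separately, so the same two formulas apply row by row.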
The shell script is here:
$cat DataAgg.sh
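# Average column 1 and combine the column-2 standard errors over all stat* files.
i=0
for s in stat*
do
    i=$((i + 1))
    if [ "$i" -eq 1 ]; then
        # First file: initialise the running sums.
        awk '{print $1}' "$s" > TmpProp
        awk '{print $2*$2}' "$s" > TmpErr2Prop
    else
        # Add this file's properties to the running sum.
        awk '{print $1}' "$s" > tmp
        paste tmp TmpProp > more
        awk '{print $1+$2}' more > TmpProp
        # Add this file's squared errors to the running sum.
        awk '{print $2*$2}' "$s" > tmp
        paste tmp TmpErr2Prop > more
        awk '{print $1+$2}' more > TmpErr2Prop
    fi
done
# Mean = sum/N; standard error of the mean = sqrt(sum of squared errors)/N.
awk '{print $1/n}' n=$i TmpProp > more; mv more TmpProp
awk '{print sqrt($1)/n}' n=$i TmpErr2Prop > more; mv more TmpErr2Prop
paste TmpProp TmpErr2Prop > more
awk '{printf("%6.2f\t%6.2f\n",$1, $2)}' more > TotalProp
# Clean up the temporary files.
rm -f TmpProp TmpErr2Prop more tmp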
Note that I don't need to know how many "stat" files there are, or how many rows each of them has. The only preconditions are that I know the common prefix ("stat") of my data files, that those files contain only the two numerical columns described above, and that they all have the same number of rows.
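As a rough sketch, the same aggregation could probably be done in a single awk pass without any temporary files. The function name aggregate_stats is made up for illustration, and the sketch assumes every input file is non-empty and has the same number of rows.

```shell
# Hypothetical helper: prints the row-wise mean and combined standard error
# for all files given as arguments (assumes two numeric columns per row).
aggregate_stats() {
    awk '
    FNR == 1 { n++ }               # a new file starts: count it
    {
        sum[FNR]  += $1            # running sum of the property in this row
        err2[FNR] += $2 * $2       # running sum of squared standard errors
        if (FNR > rows) rows = FNR # remember the largest row number seen
    }
    END {
        for (r = 1; r <= rows; r++)
            printf("%6.2f\t%6.2f\n", sum[r] / n, sqrt(err2[r]) / n)
    }' "$@"
}
```

It would be called as aggregate_stats stat* > TotalProp. Keeping the sums in awk arrays indexed by row number also avoids the rounding that print applies to intermediate values written to the temporary files.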