Quantitative Analysis, Risk Management, Modelling, Algo Trading, and Big Data Analysis

Trend Identification for FX Traders (Part 2)


In my previous post I provided an introduction to trading model invention and design. We made use of FX data for the AUDUSD pair sampled hourly, splitting the data into weekly time-series.

Today, we will work a bit harder on formulating the very first rules for the model. This step requires some creativity: we need to understand what the data look like and devise a basic set of rules that helps us perform an attractive time-series classification. Our objective is to invent a method that classifies the past week's FX pair time-series as being either in a downtrend or in an uptrend.

The most naive classifier of the directional information contained in any time-series is its slope: a straight line fitted to the data. Let's use it as our starting point. Instead of fitting all data points for a given week, we take the medians of the first and last 12 data points, both in Time $(x1,x2)$ and in Pair Ratio $(y1,y2)$, as the code below specifies:

% FX time-series analysis
% (c) Quant at Risk, 2012
%
% Part 2: Classification of weekly time-series
 
close all; scrsz = get(0,'ScreenSize');
h=figure('Position',[70 scrsz(4)/2 scrsz(3)/1.1 scrsz(4)/2],'Toolbar','none');
fprintf('\nuptrend/downtrend identification.. ');
% for viewing use the loop
hold off;
set(0,'CurrentFigure',h);
 
% pre-define variables
trend=zeros(nw,1);
slope=zeros(nw,1);
midp={};  % middle points
endp={};  % end points (median based on last 12 points)
 
for i=1:nw  %--- a loop over total number of weeks available
 
    % reading time-series for a current week 
    w=week{i}; 
    x=w(:,1); y=w(:,2);
 
    % plot the time-series
    hold on; plot(x,y,'k');
 
    % linear trend estimation
    x1=median(x(1:12)); x2=median(x(end-11:end));
    y1=median(y(1:12)); y2=median(y(end-11:end));
 
    % define end-point of the time-series and mark it on the plot
    endp{i}=[x2 y2];
    hold on; plot(endp{i}(1),endp{i}(2),'b*');
 
    % find slope
    m=(y2-y1)/(x2-x1);
    slope(i)=m;
    xl=x1:dt:x2;       
    yl=m*xl-m*x2+y2;   % the line representing the slope
    hold on; plot(xl,yl,'b:');
 
    % find middle point of the line and mark it on the plot
    mx=mean(xl);
    my=mean(yl);
    midp{i}=[mx my];
    hold on; plot(midp{i}(1),midp{i}(2),'bo');
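For readers outside MATLAB, the median-based slope estimate can be sketched in plain Python. This is a minimal illustration, not part of the original code; the function name is hypothetical, and the mid-point is taken as the exact midpoint of the fitted segment, whereas the MATLAB code averages the plotted grid points (nearly identical).

```python
from statistics import median

def median_slope(x, y, k=12):
    """Median-based weekly trend line: slope, mid-point and end-point,
    estimated from the medians of the first and last k samples."""
    x1, x2 = median(x[:k]), median(x[-k:])
    y1, y2 = median(y[:k]), median(y[-k:])
    m = (y2 - y1) / (x2 - x1)                # slope of the trend line
    midp = ((x1 + x2) / 2, (y1 + y2) / 2)    # mid-point of the segment
    endp = (x2, y2)                          # end-point of the week
    return m, midp, endp
```

For a noiseless linear week the function recovers the true slope, e.g. `median_slope(list(range(24)), [0.5*i for i in range(24)])` returns a slope of 0.5.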

As an example of the code execution, for the first two weeks we plot slopes, mid-points and end-points:

We assume that our classification procedure will be based solely on the information carried by the end-points, mid-points, and slopes. The exception is the classification of the first two weeks. For the first week, the distinction between uptrend and downtrend is based on the positions of the first and last points:

    % Time-Series Classification
 
    if(i==1)
        ybeg=y(1); yend=y(end);
        if(ybeg<yend)
            trend(i)=+1; % uptrend
            hold on; plot(x,y,'k');
        else
            trend(i)=-1; % downtrend
            hold on; plot(x,y,'r');
        end
    end

where we record the result of the classification as $+1$ or $-1$ in the corresponding element of the vector $trend$, and plot the series in black or red, denoting uptrend and downtrend, respectively.

For the second week, the classification rules are enriched with information about the end-points of the current and the previous week:

    if(i==2)
        % week(current-1)
        tmp=week{i-1};
        x1=tmp(:,1); y1=tmp(:,2);
        y1b=y1(1); y1e=y1(end);
        % week(current)
        y0b=y(1); y0e=y(end);
        if(y0e>y1e)
            trend(i)=+1; % uptrend
            hold on; plot(x,y,'k');
        else
            trend(i)=-1; % downtrend
            hold on; plot(x,y,'r');
        end
    end

For week number 3 and beyond, we do our creative research on the data to define a specific set of rules. We allow ourselves to take into account information from the two weeks prior to the current one and combine it all together. The following code represents an attractive solution, subject to improvement:

    if(i>2)
        % week(current-2)
        mid2=midp{i-2}(2);
        end2=endp{i-2}(2);
        slp2=slope(i-2);
        % week(current-1)
        mid1=midp{i-1}(2);
        end1=endp{i-1}(2);
        slp1=slope(i-1);
        % week(current)
        mid0=midp{i}(2);
        end0=endp{i}(2);
        slp0=slope(i);
        if((mid0>mid1))                     % up-trend
            if((mid0>mid2)&&(end0>end1))
                trend(i)=+1;
                hold on; plot(x,y,'k');    % strong up-trend
            elseif((mid0>mid2)&&(end0>end2))
                trend(i)=+1;
                hold on; plot(x,y,'k');    % weak up-trend
            elseif((mid0<mid2)&&(end0<end2)&&(slp0<0))
                trend(i)=-1;
                hold on; plot(x,y,'r');    % turns into possible down-trend
            elseif((mid0<mid2)&&(end0<end2)&&(slp0>0))
                trend(i)=+1;
                hold on; plot(x,y,'k');    % turns into possible up-trend
            else
                trend(i)=+1;                
                hold on; plot(x,y,'k');    % turns into possible up-trend
            end
        elseif(mid0<mid1)                  % down-trend
            if((mid0<mid2)&&(end0<end1)&&(end0<end2))
                trend(i)=-1;
                hold on; plot(x,y,'r');    % weak down-trend
            elseif((mid0<mid2)&&(end0<end2)&&(end0>end1))
                trend(i)=+1;
                hold on; plot(x,y,'k');    % possible up-trend
            elseif((mid0<mid2)&&(end0>end2))
                trend(i)=+1;
                hold on; plot(x,y,'k');    % turns into possible up-trend
            elseif((mid0>mid2)&&(end0<end1)&&(end0<end2))
                trend(i)=-1;
                hold on; plot(x,y,'r');
            elseif((mid0>mid2)&&(end0>end2))
                trend(i)=+1;
                hold on; plot(x,y,'k');    % turns into possible up-trend
            elseif((mid0>mid2)&&(end0>end1))
                trend(i)=+1;
                hold on; plot(x,y,'k');    % turns into possible up-trend
            else
                trend(i)=-1;                 
                hold on; plot(x,y,'r');
            end
        end
    end
end
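The branching for weeks 3 and beyond can be condensed considerably. Below is a hedged Python sketch of the same decision rules (hypothetical helper; exact ties between mid-points or end-points fall through slightly differently than the MATLAB else-branches):

```python
def classify_week(mid0, mid1, mid2, end0, end1, end2, slp0):
    """Condensed sketch of the week-3+ rules: +1 = uptrend, -1 = downtrend.
    Arguments are the mid-points, end-points (y-values) and current slope
    of the current week (0), previous week (1) and two weeks back (2)."""
    if mid0 > mid1:
        # mid-point above last week's: up-trend, unless everything sank
        if mid0 < mid2 and end0 < end2 and slp0 < 0:
            return -1          # turns into a possible down-trend
        return +1
    if mid0 < mid1:
        # mid-point below last week's: down-trend, unless end-point recovers
        if end0 < end1 and end0 < end2:
            return -1          # down-trend confirmed by both end-points
        if end0 > end1 or end0 > end2:
            return +1          # end-point recovery: possible up-trend
        return -1
    return 0                   # equal mid-points (not handled in MATLAB)
```

Reading the rules this way makes their intent visible: mid-points set the baseline direction, end-points confirm or veto it, and the slope breaks the remaining ambiguity on the up-trend side.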

Since one picture is worth millions of lines of code, below we present three examples of our model in action. The last plot corresponds to the latest Global Financial Crisis and shows how the uptrend weeks of 2009 followed those in downtrend a year earlier.

It is straightforward to note that our rules perform intuitively and stay in general agreement with market sentiment.

Trend Identification for FX Traders (Part 1)


When you think about inventing a new model for algorithmic trading, there are only three key elements you need to start with: creativity, data, and a programming tool. Assuming the last two are already in your possession, all that remains is seeking and finding a great new idea! No offense, but that's the hardest part of the game.

To be successful in discovering new trading solutions you have to be completely open-minded, relaxed, and well oriented in the information pertaining to your topic. Personally, after many years of programming and playing with digital signal processing techniques, I have discovered that the most essential ingredient of well-grounded research is the data itself. The more I literally stared at time-series as their properties changed, the more I was able to capture subtle differences I had often overlooked before, and with the aid of intuition and scientific experience some new ideas simply popped up.

Here I would like to share with you a part of this process.

In the Extracting Time-Series from Tick-Data article I outlined one of many possible ways of extracting FX time-series from very fine-grained data sets. As a final product we obtained two files, namely:

audusd.bid.1h
audusd.ask.1h

corresponding to the Bid and Ask prices of the Forex AUDUSD pair's trading history between Jan 2000 and May 2010. Each file contained two columns of numbers: Time (Modified Julian Day) and Price. The time resolution was selected to be 1 hour.

FOREX trading runs continuously, 24 hours a day, from Monday to Friday. The data therefore contain regular gaps corresponding to weekends. Since the data coverage is more abundant compared to, for example, the much shorter trading windows of equities or ETFs around the world, it provides a better view of trading direction within each weekly time frame. Keeping that in mind, we might look at the directional information conveyed by the data as the seed of a potential new FX model.

For now, let's focus solely on the initial pre-processing of the Bid and Ask time-series and on splitting each week into a common cell array.

% FX time-series analysis
% (c) Quant at Risk, 2012
%
% Part 1: Separation of the weeks
 
close all; clear all; clc;
 
% --analyzed FX pair
pair=['audusd'];
 
% --data
n=['./',pair,'/',pair];     % a common path to files
na=[n,'.ask.1h']; 
nb=[n,'.bid.1h'];
d1=load(na); d2=load(nb);   % loading data
d=(d1+d2)/2;                % blending
clear d1 d2

For the sake of simplicity, we decided to use a simple average of the Bid and Ask 1-hour prices for our further research. Next, we create a weekly template, $x$, for our data classification, and we find the total number of weeks available for analysis:

% time constraints from the data
t0=min(d(:,1));
tN=max(d(:,1));
t1=t0-1;
 
% weekly template for data classification
x=t1:7:tN+7;
 
% total number of weeks
nw=length(x)-1;
 
fprintf(upper(pair));
fprintf(' time-series: %3.0f weeks (%5.2f yrs)\n',nw,nw/52);

which in our case returns:

AUDUSD time-series: 539 weeks (10.37 yrs)
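The weekly template construction can be mirrored in Python; this is a hypothetical helper, not part of the original code, included to make the boundary logic explicit:

```python
def weekly_template(t0, tN):
    """Week boundaries mirroring x = t1:7:tN+7 in the MATLAB above:
    start one day before the first sample, step 7 days, run past tN."""
    t1 = t0 - 1
    x = []
    t = t1
    while t <= tN + 7:
        x.append(t)
        t += 7
    return x, len(x) - 1       # boundaries and total number of weeks
```

Each consecutive pair of boundaries delimits one week, hence the number of weeks is one less than the number of boundaries.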

The core of the programming exercise is to split all 539 weeks and save them into the cell array $week$. As we will see in the code below, we want to ensure that each week contains the same number of points, so any data missing from our FX data provider is interpolated. To do that efficiently, we use the following function, which applies Piecewise Cubic Hermite Interpolating Polynomial (pchip) interpolation to fill the gaps in the series:

function [x2,y2]=gapinterpol(x,y,dt);
    % specify axis
    x_min=x(1);
    x_max=x(length(x));
    x2=(x_min:dt:x_max);
    % interpolate gaps
    y2=pchip(x,y,x2);
end
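A quick Python analogue of the gap-filling step: scipy's `PchipInterpolator` would be the direct counterpart of MATLAB's `pchip`, but to keep the sketch dependency-free, plain linear interpolation onto the regular grid illustrates the same idea (function name hypothetical):

```python
def gap_interpolate(x, y, dt):
    """Resample (x, y) onto a regular grid of spacing dt, filling gaps.
    Linear interpolation stands in for MATLAB's pchip here."""
    n = int(round((x[-1] - x[0]) / dt))
    x2 = [x[0] + k * dt for k in range(n + 1)]
    y2, j = [], 0
    for xv in x2:
        while j < len(x) - 2 and x[j + 1] < xv:
            j += 1                              # locate bracketing segment
        t = (xv - x[j]) / (x[j + 1] - x[j])     # fractional position in it
        y2.append(y[j] + t * (y[j + 1] - y[j]))
    return x2, y2
```

The difference matters at the edges of weekend gaps: pchip preserves the shape of the series without overshooting, while linear interpolation simply bridges the gap with a straight segment.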

We carry out the separation of weeks in our program as follows:

week={};  % an empty cell array
avdt=[]; 
 
for i=1:nw
    % split FX signal according to week
    [r,c,v]=find(d(:,1)>x(i) & d(:,1)<x(i+1));
    x1=d(r,1); y1=d(r,2);
    % interpolate gaps, use 1-hour bins
    dt=1/24;
    [x2,y2]=gapinterpol(x1,y1,dt);
    % check the average sampling time, should equal to dt
    s=0;
    for j=1:length(x2)-1
        s=s+(x2(j+1)-x2(j));
    end
    tmp=s/(length(x2)-1);
    avdt=[avdt; tmp];
    % store the week signal in a cell array
    tmp=[x2; y2]; tmp=tmp';
    week{i}=tmp;
end
fprintf('average sampling after interpolation = %10.7f [d]\n',max(avdt));

where, as a check, we get:

average sampling after interpolation =  0.0416667 [d]

which corresponds to the expected value of $1/24$ day to a sufficient approximation.

A quick visual verification of our signal processing,

scrsz = get(0,'ScreenSize');
h=figure('Position',[70 scrsz(4)/2 scrsz(3)/1.1 scrsz(4)/2],'Toolbar','none');
hold off;
for i=1:nw
    w=week{i};
    x=w(:,1); y=w(:,2);
    % plot weekly signal
    hold on; plot(x,y,'k');
end
xlim([0 100]);

uncovers our desired result:

AUD/USD

Extracting Time-Series from Tick-Data


Tick-data provided in the .csv (comma-separated values) file format can be a real problem to handle quickly, especially when the total size runs into hundreds of GB. If your goal is to extract a time-series with, say, only hourly time resolution, this article provides some fresh and constructive guidelines on how to do it smartly in the Linux environment.

First, let's have a closer look at the data. Say we have a collection of 2148 .csv files hosting the FX trading history of the AUDUSD pair, covering nearly 10 years between 2000 and 2010. Each file is 7.1 MB, which leaves us with approximately 15 GB of data to process. Peeking into a randomly selected file, we can identify the header and the data themselves:

$ head -10 audusd_216.csv 
Ticks,TimeStamp,Bid Price,Bid Size,Ask Price,Ask Size
632258349000000015,2004-07-19 11:55:00.000,0.7329,1000000,0.7334,1000000
632258349000000016,2004-07-19 11:55:00.000,0.7329,1000000,0.7334,1000000
632258349000000017,2004-07-19 11:55:00.000,0.7329,1000000,0.7333,1000000
632258349000000018,2004-07-19 11:55:00.000,0.7327,1000000,0.7333,1000000
632258349000000019,2004-07-19 11:55:00.000,0.7327,1000000,0.7333,1000000
632258349000000020,2004-07-19 11:55:00.000,0.7328,1000000,0.7333,1000000
632258349000000021,2004-07-19 11:55:00.000,0.7328,1000000,0.7334,1000000
632258349600000000,2004-07-19 11:56:00.000,0.7328,1000000,0.7334,1000000
632258349600000001,2004-07-19 11:56:00.000,0.7328,1000000,0.7336,1000000

Our aim is to extract the Bid and Ask Price time-series. We will make use of a few standard Linux tools, e.g. sed and awk, supplemented with extra f77 codes. This also demonstrates how useful shell programming can be, while giving us an opportunity to explore the enigmatic syntax of its programs. We will write a shell script executable for any FX pair name, e.g. gbpnzd, eurjpy, and so on.

In the first step of the script we create a list of all files. This is tricky in Linux, as the standard 'ls -lotr' command returns the desired list but also details on file size, attributes, etc., which we simply do not want. The loop below solves the problem,

#!/bin/bash
#
# Extracting Time-Series from Tick-Data .csv files
# (c) Quant at Risk, 2012
#
# Exemplary usage: ./script.src audusd
 
echo "..making a sorted list of .csv files"
for i in $1_*.csv; do echo ${i##$1_} $i; 
done | sort -n | awk '{print $2}' > $1.lst

and the newly created file \$1.lst (note: \$1 in the shell script corresponds to the parameter the script was called with, e.g. audusd; therefore \$1.lst physically means audusd.lst) contains the list:

audusd_1.csv
audusd_2.csv
audusd_3.csv
...
audusd_2148.csv

We merge all 2148 pieces into one data file by generating and executing an in-line script:

echo "..creating one data file"
awk '{print "cat",$1," >> tmp.lst"}' $1.lst > tmp.cmd
chmod +x tmp.cmd
./tmp.cmd
rm tmp.cmd
mv tmp.lst $1.tmp

Now, \$1.tmp is a 15 GB file and we may wish to remove some unnecessary comments and tokens:

echo "..removing comments"
sed 's/Ticks,TimeStamp,Bid Price,Bid Size,Ask Price,Ask Size//g' $1.tmp > $1.tmp2
rm $1.tmp
 
echo "..removing empty lines"
sed '/^$/d' $1.tmp2 > $1.tmp
rm $1.tmp2
 
echo "..removing token ,"
sed 's/,/ /g' $1.tmp > $1.tmp2
rm $1.tmp
 
echo "..removing token :"
sed 's/:/ /g' $1.tmp2 > $1.tmp
rm $1.tmp2
 
echo "..removing token -"
sed 's/-/ /g' $1.tmp > $1.tmp2
rm $1.tmp
 
echo "..removing column with ticks and ask/bid size"
awk '{print $2,$3,$4,$5,$6,$7,$8,$10}' $1.tmp2 > $1.tmp
rm $1.tmp2

In order to convert the date and time information into a continuous measure of time, we adapt an f77 code for our task as follows:

c Extracting Time-Series from Tick-Data .csv files
c (c) Quant at Risk, 2012
c
c Program name: fx_getmjd.for
c Aim: removes ticks and converts trade time into MJD time [day]
c Input data format: YYYY MM DD HH MM SS.SSS BID BID_Vol ASK ASK_Vol
 
      implicit double precision (a-z)
      integer y,m,d,hh,mm,jd
      integer*8 bidv,askv
      character zb*50
 
      call getarg(1,zb)
      open(1,file=zb)
      do i=1.d0, 500.d6
        read(1,*,end=1) y,m,d,hh,mm,ss,bid,ask
        jd= d-32075+1461*(y+4800+(m-14)/12)/4+367*(m-2-(m-14)/12*12)
     _    /12-3*((y+4900+(m-14)/12)/100)/4
        mjd=(jd+(hh-12.d0)/24.d0+mm/1440.d0+ss/86400.d0)-2400000.5d0
        mjd=mjd-51544.d0 ! T_0 = 2000.01.01 00:00
        abr=ask/bid
        write(*,2) mjd,bid,ask,abr
      enddo                             
    1 close(1)
    2 format(F15.8,F8.4,F8.4,F12.6)
 
      stop
      end

and execute it in our script:

echo "..changing a date to MJD"
fx_getmjd $1.tmp > $1.dat
rm $1.tmp

In the aforementioned f77 code, we set the zero time point (MJD=0.00) at Jan 1, 2000 00:00. From that day on, our time is expressed as a single column measuring time progress in days, with the fractional part tracking hours and minutes.
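The same quantity can be cross-checked against the standard library in Python; this hypothetical helper reproduces the days-since-2000 value the Julian Day formula yields:

```python
from datetime import datetime

def to_mjd2000(y, m, d, hh, mm, ss=0.0):
    """Days elapsed since 2000-01-01 00:00, with the fractional part
    carrying the time of day -- the quantity the f77 code derives from
    the Julian Day formula, computed here with datetime arithmetic."""
    days = (datetime(y, m, d) - datetime(2000, 1, 1)).days
    return days + hh / 24.0 + mm / 1440.0 + ss / 86400.0
```

For instance, the first tick in the sample above, 2004-07-19 11:55, maps to roughly day 1661.4965.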

We may split the data into two separate time-series containing Bid and Ask Prices at the tick-data level:

echo "..splitting into bid/ask/abr files"
awk '{print $1,$2}' $1.dat > $1.bid
awk '{print $1,$3}' $1.dat > $1.ask

A quick inspection of both files reveals we are dealing with nearly $500\times 10^6$ lines! Before we reach our chief aim, i.e. rebinning the series at 1-hour time resolution, we unfortunately need to separate the input into 5 parts, each of at most $100\times 10^6$ lines. This limit may vary depending on the RAM available, and if memory is sufficient, this step can even be skipped. We proceed:

echo "..splitting bid/ask/abr into separate files"
fx_splitdat $1 1
fx_splitdat $1 2
fx_splitdat $1 3
fx_splitdat $1 4
fx_splitdat $1 5

where fx_splitdat.for code is given as follows:

c Extracting Time-Series from Tick-Data .csv files
c (c) Quant at Risk, 2012
c
c Program name: fx_splitdat.for
c Exemplary usage: ./fx_splitdat audusd [1,2,3,4,5]
 
      implicit double precision (a-z)
      integer      nc
      character*6  zb
      character*10 zbask,zbbid
      character*16 zb1ask,zb2ask,zb3ask,zb4ask,zb5ask
      character*16 zb1bid,zb2bid,zb3bid,zb4bid,zb5bid
      character*1  par2,st
 
c zb- name of length equal 6 characters only
      call getarg(1,zb)
      call getarg(2,par2)  ! case
 
      write(st,'(a1)') par2
      read(st,'(i1)') nc
 
      zbask=zb(1:6)//'.ask'
      zbbid=zb(1:6)//'.bid'
 
      zb1ask=zb(1:6)//'.ask.part1'
      zb2ask=zb(1:6)//'.ask.part2'
      zb3ask=zb(1:6)//'.ask.part3'
      zb4ask=zb(1:6)//'.ask.part4'
      zb5ask=zb(1:6)//'.ask.part5'
 
      zb1bid=zb(1:6)//'.bid.part1'
      zb2bid=zb(1:6)//'.bid.part2'
      zb3bid=zb(1:6)//'.bid.part3'
      zb4bid=zb(1:6)//'.bid.part4'
      zb5bid=zb(1:6)//'.bid.part5'
 
      open(11,file=zbask)
      open(12,file=zbbid)
 
      if(nc.eq.1) then
        open(21,file=zb1ask)
        open(22,file=zb1bid)
        do i=1.d0, 100.d6
          read(11,*,end=1) mjd_ask,dat_ask
          read(12,*,end=1) mjd_bid,dat_bid
          if((i>=1.0).and.(i<100000001.d0)) then
            write(21,2) mjd_ask,dat_ask
            write(22,2) mjd_bid,dat_bid
          endif
        enddo              
      endif
 
      if(nc.eq.2) then
        open(31,file=zb2ask)
        open(32,file=zb2bid)
        do i=1.d0, 200.d6
          read(11,*,end=1) mjd_ask,dat_ask
          read(12,*,end=1) mjd_bid,dat_bid
          if((i>=100000001.d0).and.(i<200000001.d0)) then
            write(31,2) mjd_ask,dat_ask
            write(32,2) mjd_bid,dat_bid
          endif
        enddo              
      endif
 
      if(nc.eq.3) then
        open(41,file=zb3ask)
        open(42,file=zb3bid)
        do i=1.d0, 300.d6
          read(11,*,end=1) mjd_ask,dat_ask
          read(12,*,end=1) mjd_bid,dat_bid
          if((i>=200000001.d0).and.(i<300000001.d0)) then
            write(41,2) mjd_ask,dat_ask
            write(42,2) mjd_bid,dat_bid
          endif
        enddo              
      endif
 
      if(nc.eq.4) then
        open(51,file=zb4ask)
        open(52,file=zb4bid)
        do i=1.d0, 400.d6
          read(11,*,end=1) mjd_ask,dat_ask
          read(12,*,end=1) mjd_bid,dat_bid
          if((i>=300000001.d0).and.(i<400000001.d0)) then
            write(51,2) mjd_ask,dat_ask
            write(52,2) mjd_bid,dat_bid
          endif
        enddo              
      endif
 
      if(nc.eq.5) then
        open(61,file=zb5ask)
        open(62,file=zb5bid)
        do i=1.d0, 500.d6
          read(11,*,end=1) mjd_ask,dat_ask
          read(12,*,end=1) mjd_bid,dat_bid
          if((i>=400000001.d0).and.(i<500000001.d0)) then
            write(61,2) mjd_ask,dat_ask
            write(62,2) mjd_bid,dat_bid
          endif
        enddo              
      endif
 
    1 close(11)
      close(12)
    2 format(F15.8,F8.4)
 
      stop
      end

which we compile as usual:

f77 fx_splitdat.for -o fx_splitdat

Finally, we can extract the rebinned Bid and Ask Price time-series with a bin time of 1 hour, i.e. $dt=0.041666667$ d, making use of the following f77 code:

c Extracting Time-Series from Tick-Data .csv files
c (c) Quant at Risk, 2012
c
c Program name: fx_rebin.for
c Exemplary usage: ./fx_rebin audusd 2
 
      implicit double precision (a-z)
      parameter (dim=100.d6)
 
      double precision f(dim), mjd(dim), step
      character*50 par1, par2, st
 
      call getarg(1,par1)  ! file name
      call getarg(2,par2)  ! bining [d]
 
      write(st,'(a20)') par2
      read(st,'(f20)') step
 
c reading data
      open(1,file=par1)
      do i=1,100.d6
        read(1,*,end=1) 
     _       mjd(i),f(i)
      enddo
    1 close(1)
      n=i-1.d0
 
c main loop
      j=1.d0
      k=1.d0
      t2=0.
      t2=dint(mjd(j))
      do while (j.lt.n)
        i=j
        if ((mjd(i)+step).gt.(mjd(n))) then
          print*
          stop
        else
          t2=t2+step
        endif
        i=j
        il=0.d0
        s=0.d0
	    do while (mjd(i).lt.t2)
	      s=s+f(i)
	      i=i+1.d0
          il=il+1.d0    ! how many points in segment
	    enddo
	    av=s/il
        day=t2-step
        if (il.ge.1.d0) then 
          write(*,3) day,av
        endif
        j=j+il
      enddo
 
    2 format(f30.7,2f30.6)
    3 format(f20.8,f8.4)
 
   10 stop
      end

executed in our script for all five parts of both tick-data time-series:

echo "..rebinning with dt = 1 h"
dt=0.041666667
fx_rebin $1.ask.part1 $dt > $1.ask.part1.1h
fx_rebin $1.ask.part2 $dt > $1.ask.part2.1h
fx_rebin $1.ask.part3 $dt > $1.ask.part3.1h
fx_rebin $1.ask.part4 $dt > $1.ask.part4.1h
fx_rebin $1.ask.part5 $dt > $1.ask.part5.1h
fx_rebin $1.bid.part1 $dt > $1.bid.part1.1h
fx_rebin $1.bid.part2 $dt > $1.bid.part2.1h
fx_rebin $1.bid.part3 $dt > $1.bid.part3.1h
fx_rebin $1.bid.part4 $dt > $1.bid.part4.1h
fx_rebin $1.bid.part5 $dt > $1.bid.part5.1h
 
echo "..appending rebinned files"
cat $1.ask.part1.1h $1.ask.part2.1h $1.ask.part3.1h $1.ask.part4.1h $1.ask.part5.1h >$1.ask.1h.tmp
cat $1.bid.part1.1h $1.bid.part2.1h $1.bid.part3.1h $1.bid.part4.1h $1.bid.part5.1h > $1.bid.1h.tmp
rm *part*
 
echo "..removing empty lines in rebinned files"
sed '/^$/d' $1.ask.1h.tmp > $1.ask.1h
rm $1.ask.1h.tmp
sed '/^$/d' $1.bid.1h.tmp > $1.bid.1h
rm $1.bid.1h.tmp
 
echo "..done!"
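As a cross-check of the rebinning step, the core averaging loop of fx_rebin can be sketched in Python. This is a hypothetical helper; unlike the f77 code, which stops before the final bin, the last (possibly partial) bin is kept here:

```python
def rebin(t, v, dt):
    """Average samples (t, v) over consecutive bins of width dt, starting
    at the integer part of the first timestamp, as fx_rebin does.
    Empty bins are skipped; each output time is the bin's left edge."""
    out, i, n = [], 0, len(t)
    edge = float(int(t[0])) + dt    # right edge of the first bin
    while i < n:
        s, cnt = 0.0, 0
        while i < n and t[i] < edge:
            s += v[i]               # accumulate prices inside the bin
            cnt += 1
            i += 1
        if cnt:
            out.append((edge - dt, s / cnt))
        edge += dt
    return out
```

With `dt = 1/24` this produces exactly the (time, average price) pairs shown in the rebinned files.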

As the final product we obtain two files, say,

audusd.bid.1h
audusd.ask.1h

of the content:

$ tail -5 audusd.bid.1h 
       3772.62500304   0.9304
       3772.66666970   0.9283
       3772.70833637   0.9266
       3772.75000304   0.9263
       3772.79166970   0.9253

where the first column is time (MJD) at 1-hour resolution and the second column contains a simple average of all tick prices falling into each 1-hour bin. For dessert, we plot both Bid and Ask Price time-series, which completes our efforts:

AUD/USD rebinned time-series with 1-hour resolution
