<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>the genetic trader . com</title>
	<atom:link href="http://www.thegenetictrader.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.thegenetictrader.com</link>
	<description>A journey into using genetic programming techniques to produce profitable trading strategies.</description>
	<lastBuildDate>Sun, 17 Jan 2010 21:44:12 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Particle Swarm Optimisation</title>
		<link>http://www.thegenetictrader.com/2010/01/14/particle-swarm-optimisation/</link>
		<comments>http://www.thegenetictrader.com/2010/01/14/particle-swarm-optimisation/#comments</comments>
		<pubDate>Thu, 14 Jan 2010 23:26:40 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[Haskell]]></category>
		<category><![CDATA[Optimisation]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=85</guid>
		<description><![CDATA[Several commentators have mentioned Particle Swarm Optimisation so I have researched it and coded an example below in Haskell. PSO is a search algorithm that mimics the swarming tendencies seen in nature e.g. flocks of birds, shoals of fish, etc. To visualize how it works it&#8217;s best to think of a natural example.
Imagine a single [...]]]></description>
			<content:encoded><![CDATA[<p>Several commentators have mentioned <a href="http://www.swarmintelligence.org/">Particle Swarm Optimisation</a> so I have researched it and coded an example below in Haskell. PSO is a search algorithm that mimics the swarming tendencies seen in nature e.g. flocks of birds, shoals of fish, etc. To visualize how it works it&#8217;s best to think of a natural example.</p>
<p>Imagine a single bird in search of food. As it moves about over a landscape its nose will pick up the scent of food &#8211; the closer the food, the stronger the scent. It knows where it registered the strongest scent, so it will concentrate its search around that location, i.e. slowing down and moving around more closely until it finds the food. One bird on its own will take a long time to search a large area, so now let&#8217;s imagine that two birds are searching and that they can tell each other where they have picked up a strong scent &#8211; this will induce swarming as the other bird travels towards the stronger scent.</p>
<p>This is how the algorithm works: the x and y coordinates in the landscape are the parameters to be optimised, i.e. the aim is to find the values that give the strongest scent &#8211; the location of the food! The birds change their velocities, i.e. how quickly they move over those parameters, to home in on stronger values, so the birds will swarm around likely solutions. The &#8216;scent&#8217; is provided by a fitness function that takes the parameters (x, y) and returns a score.</p>
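<pre style="border: 1px dashed #999999; padding: 5px; overflow: auto; font-family: Andale Mono,Lucida Console,Monaco,fixed,monospace; color: #000000; background-color: #eeeeee; font-size: 12px; line-height: 14px; width: 100%;"><code>{-# LANGUAGE ParallelListComp #-}
-- The pragma above enables the parallel list comprehensions ([ .. | x &lt;- xs | y &lt;- ys ]) used below
module Main where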

import System.Random
import Data.List (find, maximum)
import Data.Maybe

c1 = 2.0
c2 = 2.0
vMax = 5.0

-- infinite list of infinite random lists (TODO: Remove fixed seed 42)
rl :: [[Double]]
rl = [randomRs (0.0, 1.0) g | g &lt;- gl (mkStdGen 42)]
   where gl gen = gen : gl (fst $ split gen)

-- Particle data type representing one particle
data Particle = Particle {
                  pbest :: [Double],           -- particle's best solution
                  pbestFitness :: Double,
                  velocities :: [Double],      -- particle's velocities
                  present :: [Double],         -- particle's present solution
                  fitness :: Double            -- current fitness
                  } deriving (Show)

cap :: Double -&gt; Double
cap x | x &lt; 0.0   = max x (-vMax)
      | otherwise = min x vMax    

-- Calculate the new velocities for a particle
calcVelocities :: Particle -&gt; [Double] -&gt; [Double] -&gt; [Double]
calcVelocities p gbest rand = [ cap (v + c1 * r1 * (pb - pr) + c2 * r2 * (gb - pr))
                                    | v &lt;- velocities p
                                    | pr &lt;- present p
                                    | pb &lt;- pbest p
                                    | gb &lt;- gbest
                                    | r1 &lt;- rand
                                    | r2 &lt;- drop (length $ velocities p) rand]

-- Update a particle's present solution and velocities
updateParticle :: Particle -&gt; [Double] -&gt; [Double] -&gt; Particle
updateParticle p gbest rands = p { present = [pr + v | pr &lt;- present p
                                                     | v &lt;- newVelocities ],
                                   velocities = newVelocities }
    where newVelocities = calcVelocities p gbest rands                                                                                                            

-- Find the best global solution in the population
getBest :: [Particle] -&gt; [Double]
getBest p = if (isJust b) then pbest (fromJust b) else []
    where  b = find (\x -&gt; pbestFitness x == m) p
           m = maximum [pbestFitness f | f &lt;- p]

-- Return the next generation
psoEvolve :: [[Double]] -&gt; [Particle] -&gt; [Particle]
psoEvolve rands population = [updateParticle p gbest r | p &lt;- pbests | r &lt;- rands]
    where fitnesses = [p { fitness = fitnessFunction (present p) } | p &lt;- population]
          pbests    = [if (fitness p &gt; pbestFitness p) then
                          p { pbest = (present p), pbestFitness = fitness p }
                       else
                          p  | p &lt;- fitnesses ]
          gbest     = getBest pbests         

-- Solve the problem using the ranges and fitness function
psoSolve :: [(Double, Double)] -&gt; Int -&gt; Int -&gt; [Double]
psoSolve ranges numParticles numGenerations = getBest $ foldl (\a x -&gt; psoEvolve (drop (2 + x) rl) a) pop [1..numGenerations]
    where pop = [Particle [] 0.0
                     (take l (head rl))
                     [(fst r) + x * ((snd r) - (fst r)) |r &lt;- ranges |x &lt;- (take l (rl !! 1))]
                     0.0 | n &lt;- [1..numParticles]]
          l   = length ranges

-- Fitness function takes the particle's 'present' solution and returns a score
fitnessFunction :: [Double] -&gt; Double
fitnessFunction xs = (100.0 / sqrt (dx1 * dx1 + dy1 * dy1 + dz1 * dz1)) +
                     (50.0 / sqrt (dx2 * dx2 + dy2 * dy2 + dz2 * dz2))
    where dx1 = 53.4 - (xs !! 0)
          dy1 = 67.8 - (xs !! 1)
          dz1 = 32.6 - (xs !! 2)
          dx2 = 23.8 - (xs !! 0)
          dy2 = 18.0 - (xs !! 1)
          dz2 = 4.6 - (xs !! 2)

main :: IO ()
main = do
       print $ psoSolve [(0.0, 100.0), (0.0, 100.0), (0.0, 100.0)] 100 100

</code></pre>
<p>The fitness function evaluates how close the particle (bird) is to two food sources in 3D space and returns a score. There is a large food source at (53.4, 67.8, 32.6) and a small food source at (23.8, 18.0, 4.6), so the optimum solution is the position of the large food source.</p>
<p>Here&#8217;s the result :-</p>
<pre style="border: 1px dashed #999999; padding: 5px; overflow: auto; font-family: Andale Mono,Lucida Console,Monaco,fixed,monospace; color: #000000; background-color: #eeeeee; font-size: 12px; line-height: 14px; width: 100%;"><code>[1 of 1] Compiling Main             ( /home/andrew/workspace/pso/src/Main.hs, interpreted )
Ok, modules loaded: Main.
*Main&gt; main
[53.422443402059535,67.83854324652401,32.637809690239244]
*Main&gt;
</code></pre>
<p>Notice that it has found the large food source with an error of only around 0.03 on each axis, i.e. 0.03 % of the 0 to 100 range. This is with 100 particles and 100 generations, so 10,000 evaluations of the fitness function. An exhaustive search would have had to do 1000 evaluations on each axis, i.e. 0.1 increments, giving 1000 x 1000 x 1000 = 1,000,000,000 evaluations! Depending on its granularity, an exhaustive search could easily have misidentified the smaller food source as the optimum solution, which PSO avoided.</p>
<p>As you can see, PSO is a very efficient search algorithm: in this example it needed 100,000 times fewer fitness evaluations than the exhaustive search. It is also relatively simple to implement and can be used over a very wide range of problem domains.</p>
<p>So how is it going to help me in my quest to produce profitable trading strategies? Well, there are several possibilities :-</p>
<ul>
<li>Use PSO to optimise GP solutions, i.e. use GP to create the structure of the solutions but PSO to optimise the constants used in the structure,</li>
<li>Use PSO to optimise the parameters used in the GP evolutions,</li>
<li>Use PSO to optimise the parameters of hand-coded strategies.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2010/01/14/particle-swarm-optimisation/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Lazy Evaluation</title>
		<link>http://www.thegenetictrader.com/2009/12/27/lazy-evaluation/</link>
		<comments>http://www.thegenetictrader.com/2009/12/27/lazy-evaluation/#comments</comments>
		<pubDate>Sun, 27 Dec 2009 16:39:59 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=83</guid>
		<description><![CDATA[I&#8217;ve recently been learning the Haskell programming language which has some very interesting concepts, one being lazy evaluation. Lazy evaluation defers evaluating any piece of code right until the value is actually needed rather than evaluating it in the order in which it is defined as with most programming languages. One immediate benefit of this [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently been learning the Haskell programming language, which has some very interesting concepts, one being lazy evaluation. Lazy evaluation defers evaluating a piece of code until its value is actually needed, rather than evaluating it in the order in which it is defined as most programming languages do. One immediate benefit is that a conditional only evaluates the branch that is taken, without the explicit short circuit that most languages rely on.</p>
<p>This seemed to offer an optimisation opportunity for my genetic programming library which currently evaluates both sides of a branch and then only passes forward the selected one since I haven&#8217;t implemented any kind of short circuit.</p>
<p>When I tried to implement lazy evaluation, I got a bit of a shock: not only did I get automatic short circuiting in every conditional branch, it also removed the need to allocate and deallocate the transient constant nodes that strict evaluation used to store evaluated trees. So it made the code simpler, and the end result was that evolution runs that were taking 4 days now take only 6 hours!</p>
<p>This was a lovely Christmas present, an order of magnitude efficiency increase that now makes more complex models possible to calculate. I never thought that learning Haskell would have such a direct positive influence on my C code.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2009/12/27/lazy-evaluation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Need For Speed 2: OpenMP</title>
		<link>http://www.thegenetictrader.com/2009/11/24/the-need-for-speed-2-openmp/</link>
		<comments>http://www.thegenetictrader.com/2009/11/24/the-need-for-speed-2-openmp/#comments</comments>
		<pubDate>Tue, 24 Nov 2009 21:59:48 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[C]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=75</guid>
		<description><![CDATA[I&#8217;ve been getting frustrated with just how long each run is taking, which is currently about a week. This means that every time that I make a small change, it&#8217;s several days before I know whether it&#8217;s successful or not.
I&#8217;ve been through the code looking to remove anything that is not essential and pre-compute as [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been getting frustrated with just how long each run is taking, which is currently about a week. This means that every time that I make a small change, it&#8217;s several days before I know whether it&#8217;s successful or not.</p>
<p>I&#8217;ve been through the code looking to remove anything that is not essential and to pre-compute as much data as possible so that it doesn&#8217;t need to be done in the main CPU-intensive evolutionary generations. I&#8217;ve reduced the maximum syntax tree depth to 6 to limit the complexity of solutions and therefore the processing time. I&#8217;ve filtered the training data set to exclude the quiet hours overnight when I don&#8217;t intend to trade, but it is still taking far too long.</p>
<p>Then I discovered OpenMP &#8211; a standard for parallel processing in C that is supported by GCC. This means that by adding a simple pragma to my code in the main evaluation loop, the code will automatically run across all the CPU cores in my machine (only 2) giving an instant doubling of performance (almost) :-</p>
<pre style="border: 1px dashed #999999; padding: 5px; overflow: auto; font-family: Andale Mono,Lucida Console,Monaco,fixed,monospace; color: #000000; background-color: #eeeeee; font-size: 12px; line-height: 14px; width: 100%;"><code>while (g &lt; num_generations)
{
    #pragma omp parallel for private(i, p)
    for (i = 0; i &lt; num_traders; i++)
    {
        p = evaluate_trader(&amp;traders[i], 0);
        traders[i].profit = p;
    }

    sort_t(traders, 0, num_traders);
</code></pre>
<p>Notice the &#8220;#pragma omp parallel for&#8221;? This splits the loop into two separate loops running in parallel in two separate threads on two cores. Since this loop has 256 iterations and no dependencies from one iteration to the next, it is ideal for parallelism and could be split many more ways if I had more CPUs / cores.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2009/11/24/the-need-for-speed-2-openmp/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Two Profitable Strategies for GBP/USD</title>
		<link>http://www.thegenetictrader.com/2009/11/17/two-profitable-strategies-for-gbpusd/</link>
		<comments>http://www.thegenetictrader.com/2009/11/17/two-profitable-strategies-for-gbpusd/#comments</comments>
		<pubDate>Tue, 17 Nov 2009 22:40:06 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[Forex]]></category>
		<category><![CDATA[Genetic Programming]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=70</guid>
		<description><![CDATA[Well, I&#8217;m finally getting some useful results from adding the mini-tournaments with the integral validation stage. Here are two strategies trained on 5-minute GBP/USD data for 2006 &#8211; 2008 that also show a profit on 2009 data :-

;; Candidate 1

  long entry  = (- (+ (* (min hour [...]]]></description>
			<content:encoded><![CDATA[<p>Well, I&#8217;m finally getting some useful results from adding the mini-tournaments with the integral validation stage. Here are two strategies trained on 5-minute GBP/USD data for 2006 &#8211; 2008 that also show a profit on 2009 data :-</p>
<div class="scheme">
<pre><span class="comment">;; Candidate 1
</span>
  <span class="variable">long</span> <span class="variable">entry</span>  <span class="builtin">=</span> (<span class="builtin">-</span> (<span class="builtin">+</span> (<span class="builtin">*</span> (<span class="builtin">min</span> <span class="variable">hour</span> <span class="variable">bar</span>) <span class="selfeval">-30</span>) (<span class="variable">vol</span> (<span class="variable">trunc</span> <span class="variable">bid</span>))) (<span class="variable">high</span> (<span class="builtin">-</span> (<span class="builtin">abs</span> <span class="selfeval">-40</span>) (<span class="builtin">abs</span> (<span class="builtin">*</span> <span class="variable">bar</span> <span class="variable">hour</span>)))))
  <span class="variable">long</span> <span class="builtin">exit</span>  <span class="builtin">=</span> (<span class="builtin">-</span> (<span class="variable">close</span> (<span class="builtin">min</span> (<span class="builtin">abs</span> (<span class="builtin">+</span> <span class="variable">bar</span> <span class="variable">minute</span>)) (<span class="builtin">*</span> (<span class="builtin">abs</span> (<span class="builtin">abs</span> <span class="selfeval">-2</span>)) (<span class="keyword">if</span> (<span class="builtin">&lt;</span> (<span class="variable">trunc</span> <span class="variable">bid</span>) (<span class="builtin">min</span> <span class="selfeval">-24</span> <span class="variable">bar</span>)) (<span class="variable">trunc</span> (<span class="variable">low</span> <span class="variable">hour</span>)) (<span class="builtin">+</span> (<span class="variable">vol</span> <span class="selfeval">-7</span>) <span class="variable">hour</span>))))) (<span class="variable">min120</span> (<span class="builtin">min</span> (<span class="builtin">*</span> (<span class="variable">vol</span> (<span class="variable">vol</span> (<span class="variable">vol</span> <span class="selfeval">22</span>))) (<span class="builtin">min</span> (<span class="builtin">min</span> <span class="variable">bar</span> (<span class="builtin">abs</span> (<span class="builtin">*</span> <span class="variable">minute</span> <span class="variable">hour</span>))) (<span class="variable">trunc</span> <span class="variable">bid</span>))) (<span class="builtin">-</span> <span class="variable">bar</span> (<span class="variable">trunc</span> <span class="selfeval">0.137174</span>)))))
  <span class="variable">short</span> <span class="variable">entry</span>  <span class="builtin">=</span> (<span class="builtin">-</span> (<span class="variable">stoch</span> (<span class="builtin">min</span> (<span class="builtin">max</span> (<span class="builtin">max</span> (<span class="variable">vol</span> <span class="variable">minute</span>) (<span class="variable">vol</span> <span class="selfeval">15</span>)) <span class="selfeval">37</span>) <span class="selfeval">47</span>)) (<span class="builtin">abs</span> (<span class="builtin">max</span> (<span class="variable">trunc</span> (<span class="variable">stoch</span> (<span class="builtin">max</span> <span class="variable">bar</span> (<span class="variable">vol</span> <span class="selfeval">23</span>)))) <span class="variable">bar</span>)))
  <span class="variable">short</span> <span class="builtin">exit</span>  <span class="builtin">=</span> (<span class="builtin">+</span> (<span class="builtin">abs</span> (<span class="keyword">if</span> (<span class="keyword">and</span> (<span class="builtin">&lt;</span> <span class="selfeval">0.702188</span> <span class="variable">bid</span>) (<span class="builtin">&gt;</span> (<span class="builtin">+</span> (<span class="builtin">abs</span> <span class="selfeval">-39</span>) (<span class="builtin">max</span> <span class="selfeval">24</span> <span class="variable">hour</span>)) (<span class="variable">trunc</span> (<span class="variable">ema60</span> (<span class="builtin">*</span> <span class="selfeval">-20</span> <span class="selfeval">26</span>))))) (<span class="builtin">+</span> (<span class="variable">trunc</span> <span class="selfeval">0.403593</span>) (<span class="variable">trunc</span> (<span class="variable">emalow5</span> <span class="variable">hour</span>))) (<span class="builtin">max</span> <span class="variable">minute</span> (<span class="builtin">+</span> <span class="variable">bar</span> <span class="variable">bar</span>)))) (<span class="builtin">-</span> (<span class="builtin">min</span> (<span class="variable">trunc</span> <span class="variable">ask</span>) (<span class="keyword">if</span> (<span class="keyword">and</span> <span class="variable">FALSE</span> (<span class="builtin">=</span> (<span class="builtin">-</span> <span class="variable">hour</span> <span class="selfeval">-32</span>) <span class="variable">minute</span>)) <span class="selfeval">46</span> <span class="selfeval">35</span>)) (<span class="builtin">*</span> <span class="selfeval">33</span> (<span class="keyword">if</span> <span class="variable">FALSE</span> <span class="selfeval">0.482923</span> <span class="variable">ask</span>))))

<span class="comment">;; Candidate 2
</span>  <span class="variable">long</span> <span class="variable">entry</span>  <span class="builtin">=</span> (<span class="variable">double</span> (<span class="builtin">max</span> (<span class="variable">vol</span> (<span class="builtin">max</span> <span class="variable">minute</span> <span class="selfeval">20</span>)) <span class="variable">bar</span>))
  <span class="variable">long</span> <span class="builtin">exit</span>  <span class="builtin">=</span> (<span class="variable">emahigh10</span> (<span class="builtin">max</span> (<span class="variable">trunc</span> <span class="selfeval">0.859856</span>) <span class="variable">bar</span>))
  <span class="variable">short</span> <span class="variable">entry</span>  <span class="builtin">=</span> (<span class="variable">stoch</span> (<span class="builtin">+</span> (<span class="variable">trunc</span> (<span class="builtin">-</span> (<span class="builtin">*</span> <span class="selfeval">0.498268</span> (<span class="variable">stoch</span> <span class="variable">minute</span>)) (<span class="builtin">max</span> <span class="variable">bar</span> (<span class="variable">vol</span> (<span class="builtin">+</span> <span class="variable">hour</span> <span class="selfeval">22</span>))))) (<span class="keyword">if</span> (<span class="builtin">&lt;</span> (<span class="builtin">*</span> <span class="variable">bar</span> (<span class="builtin">-</span> <span class="selfeval">-30</span> <span class="variable">minute</span>)) (<span class="builtin">*</span> (<span class="builtin">*</span> (<span class="builtin">*</span> <span class="selfeval">22</span> <span class="variable">bar</span>) (<span class="variable">trunc</span> <span class="selfeval">0.041836</span>)) <span class="selfeval">48</span>)) (<span class="builtin">*</span> <span class="selfeval">-17</span> <span class="selfeval">54</span>) (<span class="variable">trunc</span> (<span class="builtin">*</span> <span class="variable">bar</span> (<span class="variable">max120</span> (<span class="variable">vol</span> <span class="selfeval">32</span>)))))))
  <span class="variable">short</span> <span class="builtin">exit</span>  <span class="builtin">=</span> (<span class="builtin">*</span> (<span class="keyword">if</span> (<span class="builtin">=</span> (<span class="variable">vol</span> <span class="selfeval">-27</span>) <span class="variable">hour</span>) (<span class="builtin">max</span> (<span class="builtin">max</span> (<span class="keyword">if</span> (<span class="builtin">&gt;</span> <span class="variable">minute</span> (<span class="variable">vol</span> <span class="selfeval">19</span>)) <span class="selfeval">36</span> <span class="selfeval">-6</span>) (<span class="builtin">+</span> <span class="variable">hour</span> (<span class="builtin">min</span> <span class="variable">bar</span> (<span class="builtin">-</span> <span class="selfeval">-9</span> <span class="variable">hour</span>)))) (<span class="builtin">max</span> (<span class="variable">vol</span> <span class="selfeval">20</span>) (<span class="variable">trunc</span> (<span class="variable">ema20</span> (<span class="builtin">abs</span> <span class="selfeval">16</span>))))) <span class="variable">bar</span>) (<span class="builtin">sin</span> <span class="selfeval">0.015177</span>))</pre>
</div>
<p>The profit is very modest for both at around 1000 pips for the year to the end of September.</p>
<p>Each algorithm (long entry, long exit, short entry and short exit) is used as a signal to enter or exit a long or short trade. If the algorithm returns a double value greater than zero, it is taken as a signal to proceed.</p>
<p>Looking at the 2nd candidate, it is essentially a shorting strategy based on the stochastic oscillator, because long entry and long exit will always evaluate to a positive value, meaning that any long trade entered (i.e. in the absence of an established short trade) will immediately be exited. This suggests that considerable improvement could be achieved by manually removing the long trading aspect.</p>
<p>It is fairly typical to get nonsense code in the results, i.e. conditional code that always evaluates to one particular value because of the nature of the underlying data, e.g. checking that a close value is greater than zero, which it clearly always will be. I&#8217;ve been hand adjusting the winning algorithms before reusing them, and I have an automated simplification stage in the processing that looks to remove pointless code where it can. Perhaps the bloat could be cut by reducing the maximum depth of algorithms. Currently the maximum depth is set to 8, which means that you can have a maximum of 8 levels of nesting in the algorithm before shrink mutation is enforced. Reducing this to, say, 5 might reduce code bloat without compromising the solution quality &#8211; it would certainly speed up processing, which is proving an issue again at the moment.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2009/11/17/two-profitable-strategies-for-gbpusd/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Random Killings</title>
		<link>http://www.thegenetictrader.com/2009/11/11/random-killings/</link>
		<comments>http://www.thegenetictrader.com/2009/11/11/random-killings/#comments</comments>
		<pubDate>Wed, 11 Nov 2009 10:12:48 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[C]]></category>
		<category><![CDATA[Genetic Programming]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=68</guid>
		<description><![CDATA[I&#8217;ve had an interesting problem since I started my mini-tournaments with validation on unseen data in that small runs completed OK but larger ones have either locked up the computer (unresponsive with the hard disk thrashing) or the process had been killed.
At first I suspected that my laptop had been hacked into and the hacker [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve had an interesting problem since I started my mini-tournaments with validation on unseen data: small runs completed OK, but larger ones have either locked up the computer (unresponsive, with the hard disk thrashing) or been killed.</p>
<p>At first I suspected that my laptop had been hacked into and the hacker had killed my GP process because it was burning too much CPU and interfering with his spam mailing process. So I disconnected it from the internet and tried again but got the same results.</p>
<p>When I checked /var/log/messages, it turned out that the process had been killed by &#8216;oom_killer&#8217;, the Linux kernel&#8217;s out-of-memory killer. It was killed for the simple reason that it had taken all the system memory and was still demanding more, i.e. it had a severe memory leak!</p>
<p>After checking with valgrind, this turned out to be a missing free in the crossover function in my ast library. With this fixed, the runs continue.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2009/11/11/random-killings/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Initial results are disappointing</title>
		<link>http://www.thegenetictrader.com/2009/11/05/initial-results-are-disappointing/</link>
		<comments>http://www.thegenetictrader.com/2009/11/05/initial-results-are-disappointing/#comments</comments>
		<pubDate>Thu, 05 Nov 2009 19:18:16 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[Forex]]></category>
		<category><![CDATA[Genetic Programming]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=63</guid>
		<description><![CDATA[Well after several weeks of running I ended up with two contender algorithms :-

Generation 511: Best = 4962.700000
  long entry  = (- (+ (* (min hour bar) -30) (vol (trunc bid))) (high (- (abs -40) (abs (* bar hour)))))
  long exit  = (- (close (min (abs (+ bar minute)) (* (abs [...]]]></description>
			<content:encoded><![CDATA[<p>Well after several weeks of running I ended up with two contender algorithms :-</p>
<div class="scheme">
<pre><span class="variable">Generation</span> <span class="variable">511:</span> <span class="variable">Best</span> <span class="builtin">=</span> <span class="selfeval">4962.700000</span>
  <span class="variable">long</span> <span class="variable">entry</span>  <span class="builtin">=</span> (<span class="builtin">-</span> (<span class="builtin">+</span> (<span class="builtin">*</span> (<span class="builtin">min</span> <span class="variable">hour</span> <span class="variable">bar</span>) <span class="selfeval">-30</span>) (<span class="variable">vol</span> (<span class="variable">trunc</span> <span class="variable">bid</span>))) (<span class="variable">high</span> (<span class="builtin">-</span> (<span class="builtin">abs</span> <span class="selfeval">-40</span>) (<span class="builtin">abs</span> (<span class="builtin">*</span> <span class="variable">bar</span> <span class="variable">hour</span>)))))
  <span class="variable">long</span> <span class="builtin">exit</span>  <span class="builtin">=</span> (<span class="builtin">-</span> (<span class="variable">close</span> (<span class="builtin">min</span> (<span class="builtin">abs</span> (<span class="builtin">+</span> <span class="variable">bar</span> <span class="variable">minute</span>)) (<span class="builtin">*</span> (<span class="builtin">abs</span> (<span class="builtin">abs</span> <span class="selfeval">-2</span>)) (<span class="keyword">if</span> (<span class="builtin">&lt;</span> (<span class="variable">trunc</span> <span class="variable">bid</span>) (<span class="builtin">min</span> <span class="selfeval">-24</span> <span class="variable">bar</span>)) (<span class="variable">trunc</span> (<span class="variable">low</span> <span class="variable">hour</span>)) (<span class="builtin">+</span> (<span class="variable">vol</span> <span class="selfeval">-7</span>) <span class="variable">hour</span>))))) (<span class="variable">min120</span> (<span class="builtin">min</span> (<span class="builtin">*</span> (<span class="variable">vol</span> (<span class="variable">vol</span> (<span class="variable">vol</span> <span class="selfeval">22</span>))) (<span class="builtin">min</span> (<span class="builtin">min</span> <span class="variable">bar</span> (<span class="builtin">abs</span> (<span class="builtin">*</span> <span class="variable">minute</span> <span class="variable">hour</span>))) (<span class="variable">trunc</span> <span class="variable">bid</span>))) (<span class="builtin">-</span> <span class="variable">bar</span> (<span class="variable">trunc</span> <span class="selfeval">0.137174</span>)))))
  <span class="variable">short</span> <span class="variable">entry</span>  <span class="builtin">=</span> (<span class="builtin">-</span> (<span class="variable">stoch</span> (<span class="builtin">min</span> (<span class="builtin">max</span> (<span class="builtin">max</span> (<span class="variable">vol</span> <span class="variable">minute</span>) (<span class="variable">vol</span> <span class="selfeval">15</span>)) <span class="selfeval">37</span>) <span class="selfeval">47</span>)) (<span class="builtin">abs</span> (<span class="builtin">max</span> (<span class="variable">trunc</span> (<span class="variable">stoch</span> (<span class="builtin">max</span> <span class="variable">bar</span> (<span class="variable">vol</span> <span class="selfeval">23</span>)))) <span class="variable">bar</span>)))
  <span class="variable">short</span> <span class="builtin">exit</span>  <span class="builtin">=</span> (<span class="builtin">+</span> (<span class="builtin">abs</span> (<span class="keyword">if</span> (<span class="keyword">and</span> (<span class="builtin">&lt;</span> <span class="selfeval">0.702188</span> <span class="variable">bid</span>) (<span class="builtin">&gt;</span> (<span class="builtin">+</span> (<span class="builtin">abs</span> <span class="selfeval">-39</span>) (<span class="builtin">max</span> <span class="selfeval">24</span> <span class="variable">hour</span>)) (<span class="variable">trunc</span> (<span class="variable">ema60</span> (<span class="builtin">*</span> <span class="selfeval">-20</span> <span class="selfeval">26</span>))))) (<span class="builtin">+</span> (<span class="variable">trunc</span> <span class="selfeval">0.403593</span>) (<span class="variable">trunc</span> (<span class="variable">emalow5</span> <span class="variable">hour</span>))) (<span class="builtin">max</span> <span class="variable">minute</span> (<span class="builtin">+</span> <span class="variable">bar</span> <span class="variable">bar</span>)))) (<span class="builtin">-</span> (<span class="builtin">min</span> (<span class="variable">trunc</span> <span class="variable">ask</span>) (<span class="keyword">if</span> (<span class="keyword">and</span> <span class="variable">FALSE</span> (<span class="builtin">=</span> (<span class="builtin">-</span> <span class="variable">hour</span> <span class="selfeval">-32</span>) <span class="variable">minute</span>)) <span class="selfeval">46</span> <span class="selfeval">35</span>)) (<span class="builtin">*</span> <span class="selfeval">33</span> (<span class="keyword">if</span> <span class="variable">FALSE</span> <span class="selfeval">0.482923</span> <span class="variable">ask</span>))))

<span class="variable">Generation</span> <span class="variable">389:</span> <span class="variable">Best</span> <span class="builtin">=</span> <span class="selfeval">3146.500000</span>
  <span class="variable">long</span> <span class="variable">entry</span>  <span class="builtin">=</span> (<span class="builtin">*</span> (<span class="variable">high</span> (<span class="builtin">+</span> <span class="variable">bar</span> <span class="selfeval">-17</span>)) (<span class="builtin">/</span> (<span class="variable">lower</span> (<span class="builtin">min</span> (<span class="variable">trunc</span> (<span class="builtin">/</span> <span class="variable">minute</span> (<span class="variable">min120</span> <span class="selfeval">-35</span>))) (<span class="variable">trunc</span> (<span class="builtin">*</span> (<span class="builtin">*</span> <span class="selfeval">0.001833</span> (<span class="builtin">sin</span> <span class="selfeval">0.789917</span>)) <span class="selfeval">18</span>)))) (<span class="builtin">/</span> (<span class="builtin">min</span> (<span class="variable">trunc</span> (<span class="variable">min120</span> (<span class="builtin">+</span> <span class="variable">hour</span> (<span class="builtin">+</span> <span class="variable">hour</span> <span class="selfeval">16</span>)))) (<span class="variable">vol</span> <span class="selfeval">33</span>)) (<span class="builtin">abs</span> (<span class="variable">trunc</span> (<span class="builtin">-</span> (<span class="variable">max120</span> <span class="selfeval">19</span>) <span class="selfeval">0.677009</span>))))))
  <span class="variable">long</span> <span class="builtin">exit</span>  <span class="builtin">=</span> (<span class="variable">ema5</span> (<span class="variable">trunc</span> (<span class="variable">emalow5</span> (<span class="builtin">max</span> <span class="selfeval">20</span> <span class="selfeval">30</span>))))
  <span class="variable">short</span> <span class="variable">entry</span>  <span class="builtin">=</span> (<span class="variable">upper</span> (<span class="keyword">if</span> (<span class="builtin">&lt;</span> (<span class="builtin">min</span> (<span class="builtin">abs</span> (<span class="builtin">*</span> <span class="variable">hour</span> <span class="variable">bar</span>)) (<span class="builtin">max</span> <span class="variable">hour</span> <span class="variable">bar</span>)) (<span class="builtin">max</span> <span class="variable">minute</span> (<span class="builtin">max</span> <span class="selfeval">-9</span> (<span class="builtin">+</span> (<span class="builtin">abs</span> <span class="selfeval">17</span>) <span class="variable">bar</span>)))) (<span class="keyword">if</span> (<span class="keyword">and</span> (<span class="builtin">&gt;</span> (<span class="variable">emalow10</span> (<span class="builtin">*</span> <span class="variable">hour</span> <span class="variable">bar</span>)) <span class="selfeval">0.000323</span>) (<span class="builtin">&lt;</span> (<span class="variable">double</span> <span class="variable">minute</span>) (<span class="builtin">-</span> <span class="selfeval">0.938433</span> <span class="selfeval">0.677536</span>))) (<span class="builtin">max</span> (<span class="builtin">+</span> <span class="selfeval">25</span> <span class="selfeval">-24</span>) (<span class="builtin">abs</span> <span class="variable">minute</span>)) <span class="variable">bar</span>) (<span class="builtin">abs</span> (<span class="builtin">min</span> (<span class="variable">vol</span> (<span class="builtin">-</span> <span class="selfeval">-25</span> <span class="selfeval">39</span>)) (<span class="builtin">max</span> <span class="variable">bar</span> <span class="selfeval">6</span>)))))
  <span class="variable">short</span> <span class="builtin">exit</span>  <span class="builtin">=</span> (<span class="variable">emalow5</span> (<span class="keyword">if</span> (<span class="keyword">and</span> (<span class="builtin">&gt;</span> (<span class="variable">min120</span> <span class="variable">hour</span>) <span class="selfeval">0.121671</span>) (<span class="builtin">&gt;</span> (<span class="builtin">*</span> (<span class="builtin">-</span> <span class="selfeval">50</span> <span class="variable">hour</span>) (<span class="builtin">abs</span> (<span class="builtin">+</span> <span class="selfeval">40</span> <span class="selfeval">13</span>))) (<span class="variable">trunc</span> <span class="variable">ask</span>))) (<span class="builtin">+</span> <span class="variable">bar</span> <span class="variable">minute</span>) <span class="variable">bar</span>))</pre>
</div>
<p>
As you can see, each solution has four algorithms: long_entry, long_exit, short_entry and short_exit. Each of these returns a double which, if greater than 0.0, is a signal to enter or exit the trade.</p>
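<p>As a rough illustration of how the four signals could drive a position, here is a minimal Python sketch. It assumes that only one position is open at a time and that the long entry is checked before the short entry. Both are assumptions made for the example, as are the names <code>next_position</code>, <code>rules</code> and <code>market</code>, none of which come from the actual program.</p>

```python
def next_position(position, rules, market):
    """Return 'flat', 'long' or 'short' for the next bar.

    rules maps the four names to functions that take the market state and
    return a double; a value greater than 0.0 counts as a signal.
    """
    if position == "long":
        return "flat" if rules["long_exit"](market) > 0.0 else "long"
    if position == "short":
        return "flat" if rules["short_exit"](market) > 0.0 else "short"
    # Flat: look for an entry (checking long first is an arbitrary choice).
    if rules["long_entry"](market) > 0.0:
        return "long"
    if rules["short_entry"](market) > 0.0:
        return "short"
    return "flat"


# Toy rules, invented purely for illustration.
toy_rules = {
    "long_entry": lambda m: m["bid"] - 1.5,
    "long_exit": lambda m: 1.4 - m["bid"],
    "short_entry": lambda m: -1.0,
    "short_exit": lambda m: -1.0,
}
```

<p>With these toy rules a flat position goes long once the bid is above 1.5 and is closed again when the bid drops below 1.4.</p>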
<p>The &#8216;Best&#8217; value is the profit in pips over the training period, which is three years, so it is not massive. More disappointingly, when I tried each solution on two years of previously unseen data, the first made a loss and the second made a very modest profit of around 1000 pips &#8211; a value that I would not consider statistically significant and therefore not conclusive of success.</p>
<p>Uninspired by these results, I have modified the program to continue the mini-tournaments, but after each one it takes the winning algorithm and tests it on unseen data. If the algorithm shows a profit on the unseen validation data then it is carried forward; otherwise it is discarded there and then. Hopefully this will ensure that all of the final competing algorithms have some potential to show repeatable profits.</p>
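<p>The validation step can be sketched in a few lines of Python. This is only an outline of the idea: <code>carry_forward</code> and <code>profit_on</code> are hypothetical names, and the toy profit function stands in for a real back-test.</p>

```python
def carry_forward(winners, profit_on, validation_data):
    """Keep only the tournament winners that are profitable on unseen data."""
    return [w for w in winners if profit_on(w, validation_data) > 0.0]


# Toy stand-in for a back-test: a 'program' is just the pips it wins per row.
def toy_profit(program, rows):
    return program * len(rows)
```

<p>Calling <code>carry_forward([2.0, -1.0, 0.0, 0.5], toy_profit, [1, 2, 3])</code> keeps the first and last entries and discards the loser and the break-even program.</p>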
<p>At this stage my expectation of success is probably slightly less than 50%.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2009/11/05/initial-results-are-disappointing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Added &#8216;Shrink&#8217; Mutation and mini-Tournaments</title>
		<link>http://www.thegenetictrader.com/2009/09/30/added-shrink-mutation-and-mini-tournaments/</link>
		<comments>http://www.thegenetictrader.com/2009/09/30/added-shrink-mutation-and-mini-tournaments/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 10:21:38 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[Forex]]></category>
		<category><![CDATA[Genetic Programming]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=61</guid>
		<description><![CDATA[Although the run was 2 days in, I had noticed that the profitability was not increasing particularly from a level of around 11,000 pips (over 21 months)  and that the programs were becoming bloated which was slowing the generations.
Even though I had reinstated the simplification process, the mutation scheme generally encourages increasing complexity over time [...]]]></description>
			<content:encoded><![CDATA[<p>Although the run was 2 days in, I had noticed that the profitability was not increasing particularly from a level of around 11,000 pips (over 21 months)  and that the programs were becoming bloated which was slowing the generations.</p>
<p>Even though I had reinstated the simplification process, the mutation scheme generally encourages increasing complexity over time because the mutation operators either keep the nodes the same size or increase them by replacing them with a more complex alternative. Unchecked, this could lead to overfitting because the program will get more and more complex until it is ultimately a lookup table for the training data!</p>
<p>Another thing that I noticed was that the solutions weren&#8217;t particularly inspiring and relied pretty much on a buy and hold type strategy at the right levels &#8211; which would be pretty useless in any other data set since the levels would be different.</p>
<p>I took the decision to terminate the run early and make some changes:</p>
<ol>
<li>Introduce &#8216;shrink&#8217; mutation, which makes solutions smaller by replacing nodes with smaller alternatives. The method I used was &#8216;hoist&#8217; mutation, where you find a sub-node with the same type as the node you are replacing and hoist it up to replace the current node. This reduces the levels of nesting by one and also sheds all the adjacent branches. Additionally, I now monitor the overall depth of each program and, if it reaches a maximum threshold, growth mutation is prevented and only shrink or no change is permitted. Hopefully this restriction will encourage general solutions rather than over-complex and overfitted solutions. It also keeps the generational time down.</li>
<li>It occurred to me that perhaps I&#8217;m expecting too much in hoping to generate a decent strategy like trend following, retracements or breakouts by presenting such a huge mass of data. Maybe it&#8217;s not surprising that we end up with simple level-based strategies that are naturally limited in their profit potential. To get around this I decided to slice the data into 100 separate slices of 1,000 rows each (around 4 days) and run an evolution on each slice. Then for the main evolution, I take the 100 separate winners and merge them with random programs to make up the initial population for the complete data set. The theory here is that with less data, these individual evolutions should be able to come up with more specific strategies that might work on the data set as a whole, i.e. learn small, try big. Looking at the results from the individual slices, their average profitability was in the 400 &#8211; 600 pips range, suggesting that on the complete data set an overall profitability of around 50,000 pips (100 slices at roughly 500 pips each) should be possible if any of the strategies can be translated onto the full data set.</li>
</ol>
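<p>To make the hoist mutation and the depth cap from the first point concrete, here is a small Python sketch. It is not the C code from the actual program: the <code>Node</code> class, the <code>grow</code> callback and the 50/50 choice between growing and shrinking are all assumptions for illustration, and the mutation is applied at the root only to keep the example short.</p>

```python
import random

MAX_DEPTH = 8  # depth at which growth mutation is switched off


class Node:
    def __init__(self, kind, label, children=()):
        self.kind = kind              # result type, e.g. "num" or "bool"
        self.label = label            # operator or terminal name
        self.children = list(children)


def depth(node):
    """Number of levels in the tree rooted at node (a leaf has depth 1)."""
    return 1 + max((depth(c) for c in node.children), default=0)


def hoist(node, rng=random):
    """Replace node with one of its children of the same result type.

    The other branches are shed; if no child has a matching type the node
    is returned unchanged.
    """
    candidates = [c for c in node.children if c.kind == node.kind]
    return rng.choice(candidates) if candidates else node


def mutate(tree, grow, rng=random):
    """Grow or shrink at random, but only shrink once the cap is reached."""
    if depth(tree) >= MAX_DEPTH or rng.random() < 0.5:
        return hoist(tree, rng)
    return grow(tree)
```

<p>Because a hoisted child is always at least one level shallower than its parent, a tree that has hit the cap can only get smaller or stay the same until it drops back under the threshold.</p>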
<p>So now I have a full run going that was seeded by the individual winners from the 100 slices and with maximum depth limited to 8 levels.</p>
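<p>The &#8216;learn small, try big&#8217; seeding could be sketched in Python along these lines. The functions <code>evolve</code> and <code>random_program</code> are hypothetical placeholders for the per-slice evolution and the random program generator; only the slicing and merging logic is shown.</p>

```python
def slices(rows, size):
    """Split rows into consecutive chunks of `size` (the last may be shorter)."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def seed_population(rows, slice_size, population_size, evolve, random_program):
    """Build the initial population for the run over the complete data set.

    One winner is evolved per slice, then random programs fill the rest.
    """
    winners = [evolve(chunk) for chunk in slices(rows, slice_size)]
    fillers = [random_program() for _ in range(population_size - len(winners))]
    return winners + fillers
```

<p>With 100,000 rows and a slice size of 1,000 this gives 100 winners, and the remainder of the population is made up of random programs.</p>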
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2009/09/30/added-shrink-mutation-and-mini-tournaments/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Run One Underway</title>
		<link>http://www.thegenetictrader.com/2009/09/27/run-one-underway/</link>
		<comments>http://www.thegenetictrader.com/2009/09/27/run-one-underway/#comments</comments>
		<pubDate>Sun, 27 Sep 2009 20:51:20 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[Forex]]></category>
		<category><![CDATA[Genetic Programming]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=58</guid>
		<description><![CDATA[I tracked down the last remaining bug in the simplification process and did a mini run for 2 days which failed to produce any more bugs and produced a modestly profitable strategy even without stop losses and the full set of indicators.
I had previously disabled the simplification process as it contained a bug and thought [...]]]></description>
			<content:encoded><![CDATA[<p>I tracked down the last remaining bug in the simplification process and did a mini run for 2 days which failed to produce any more bugs and produced a modestly profitable strategy even without stop losses and the full set of indicators.</p>
<p>I had previously disabled the simplification process as it contained a bug and thought maybe it was unnecessary. The process searches the syntax tree for any nodes that are a waste of time, e.g. (+ 5.0 2.3), and replaces them with the result, e.g. 7.3. During the mini run, I noticed that the generations were getting slower and slower because the syntax trees were getting more and more complex, often needlessly so. Because of this, I reinstated the process, which forced me to fix the bug.</p>
<p>So with this fixed and stop losses now implemented, the first serious run is now underway and should finish within 5 days. The terminal set now consists of:</p>
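<p>This kind of simplification is usually called constant folding. A minimal Python sketch, assuming expressions are stored as nested tuples such as <code>("+", 5.0, 2.3)</code> and handling only three arithmetic operators (both are assumptions for the example, not a description of the real C code), might look like this:</p>

```python
import operator

# Operators that are safe to evaluate ahead of time in this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def simplify(expr):
    """Fold constant sub-expressions bottom-up and return the smaller tree."""
    if not isinstance(expr, tuple):
        return expr                      # a constant or a variable name
    op, *args = expr
    args = [simplify(a) for a in args]   # simplify the children first
    all_constant = all(isinstance(a, (int, float)) for a in args)
    if op in OPS and len(args) == 2 and all_constant:
        return OPS[op](*args)            # replace the node with its result
    return (op, *args)
```

<p>Here <code>simplify(("*", "bar", ("+", 5.0, 2.3)))</code> folds the inner addition but leaves the multiplication alone because <code>bar</code> is only known at run time.</p>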
<p>variable &#8220;bid&#8221; &#8211; current bid price<br />
variable &#8220;ask&#8221; &#8211; current ask price<br />
variable &#8220;bar&#8221; &#8211; current bar count<br />
variable &#8220;hour&#8221; &#8211; current hour<br />
variable &#8220;minute&#8221; &#8211; current minute<br />
function &#8220;open&#8221; &#8211; candle open time series access e.g. (open 1) last open, (open 2) last but one etc.<br />
function &#8220;close&#8221; &#8211; candle close time series access<br />
function &#8220;low&#8221; &#8211; candle low time series access<br />
function &#8220;high&#8221; &#8211; candle high time series access<br />
function &#8220;vol&#8221; &#8211; volume time series access<br />
function &#8220;sma5&#8221; &#8211; 5 period simple moving average of close price<br />
function &#8220;sma10&#8221; &#8211; 10 period simple moving average<br />
function &#8220;sma20&#8221; &#8211; 20 period simple moving average<br />
function &#8220;sma30&#8221; &#8211; 30 period simple moving average<br />
function &#8220;sma60&#8221; &#8211; 60 period simple moving average<br />
function &#8220;ema5&#8221; &#8211; 5 period exponential moving average of close price<br />
function &#8220;ema10&#8221; &#8211; 10 period exponential moving average<br />
function &#8220;ema20&#8221; &#8211; 20 period exponential moving average<br />
function &#8220;ema30&#8221; &#8211; 30 period exponential moving average<br />
function &#8220;ema60&#8221; &#8211; 60 period exponential moving average<br />
function &#8220;emahigh5&#8221; &#8211; 5 period EMA of high price<br />
function &#8220;emahigh10&#8221; &#8211; 10 period EMA of high price<br />
function &#8220;emalow5&#8221; &#8211; 5 period EMA of low price<br />
function &#8220;emalow10&#8221; &#8211; 10 period EMA of low price<br />
function &#8220;max60&#8221; &#8211; highest price in last 60 periods (possible resistance level)<br />
function &#8220;max120&#8221; &#8211; highest price in last 120 periods (possible resistance level)<br />
function &#8220;min60&#8221; &#8211; lowest price in last 60 periods (possible support level)<br />
function &#8220;min120&#8221; &#8211; lowest price in last 120 periods (possible support level)<br />
function &#8220;upper&#8221; &#8211; upper Bollinger band<br />
function &#8220;lower&#8221; &#8211; lower Bollinger band</p>
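<p>To illustrate the time series access convention, where (open 1) is the last value and (open 2) the last but one, here is a small Python sketch of a lookback function and a simple moving average built on the same indexing. The names <code>lookback</code> and <code>sma</code> are made up for the example. How the real functions treat out-of-range arguments is not stated here, so this sketch simply raises an error.</p>

```python
def lookback(series, n):
    """Time series access: n=1 is the last value, n=2 the last but one."""
    if n < 1 or n > len(series):
        raise IndexError("lookback outside the available history")
    return series[-n]


def sma(series, period, n=1):
    """Simple moving average of the `period` values ending at lookback n."""
    end = len(series) - n + 1          # one past the last value in the window
    if n < 1 or end - period < 0:
        raise IndexError("not enough history for this window")
    return sum(series[end - period:end]) / period
```

<p>For closes of 1.0 to 6.0, <code>lookback(closes, 2)</code> picks out 5.0 and <code>sma(closes, 5)</code> averages the last five values.</p>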
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2009/09/27/run-one-underway/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Two Steps Forward, One Back</title>
		<link>http://www.thegenetictrader.com/2009/09/25/two-steps-forward-one-back/</link>
		<comments>http://www.thegenetictrader.com/2009/09/25/two-steps-forward-one-back/#comments</comments>
		<pubDate>Fri, 25 Sep 2009 08:31:04 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[Forex]]></category>
		<category><![CDATA[Genetic Programming]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=54</guid>
		<description><![CDATA[Last night I tracked down the segmentation violation that was causing my programs to bomb out &#8211; it was an error in the simplification process. I&#8217;ve been having second thoughts on simplification so I&#8217;ve disabled it for the time being &#8211; more on this in a later post.
I also enriched the terminal set to include [...]]]></description>
			<content:encoded><![CDATA[<p>Last night I tracked down the segmentation violation that was causing my programs to bomb out &#8211; it was an error in the simplification process. I&#8217;ve been having second thoughts on simplification so I&#8217;ve disabled it for the time being &#8211; more on this in a later post.</p>
<p>I also enriched the terminal set to include a 20 period simple moving average (SMA) and upper and lower Bollinger Bands. I also spotted a feature where very risk-averse algorithms, i.e. those that never made any trades, were doing rather well because their profit was £0, which is very good compared with the horrendous losses of the competitors which actually did trade. To discourage this risk-averse behaviour, I changed the fitness function to penalise any algorithm that didn&#8217;t make any trades by giving it a profit of -£100,000.</p>
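<p>In outline, the penalty amounts to something like the following Python sketch. It assumes the fitness is simply the summed profit of all trades, and the names <code>fitness</code> and <code>trade_profits</code> are invented for the example.</p>

```python
NO_TRADE_PENALTY = -100_000.0  # profit assigned to an algorithm that never trades


def fitness(trade_profits):
    """Total profit over all trades, or the penalty if no trades were made."""
    if not trade_profits:
        return NO_TRADE_PENALTY
    return sum(trade_profits)
```

<p>An algorithm that trades and loses £500 now scores far better than one that sits on its hands, so doing nothing is no longer a winning strategy.</p>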
<p>With this I set a small run going overnight with a small population of 128 and 128 generations to flush out any further errors.</p>
<p>The run was successful in that it flushed out a further error related to the type system &#8211; a mutation had created a function with an input of the wrong type. This should never happen so there is another bug in there somewhere. This error terminated the run.</p>
<p>The good news is that the run had completed 41 generations before the error occurred and produced a profitable algorithm! It was a modest profit of only £10,000 (over two years and trading with 0.1 lots) but it was a very encouraging start. I would hope that with a larger population and longer runs we should get a profit of around 10 times this.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2009/09/25/two-steps-forward-one-back/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bloody C Code</title>
		<link>http://www.thegenetictrader.com/2009/09/24/bloody-c-code/</link>
		<comments>http://www.thegenetictrader.com/2009/09/24/bloody-c-code/#comments</comments>
		<pubDate>Thu, 24 Sep 2009 08:26:32 +0000</pubDate>
		<dc:creator>andrew</dc:creator>
				<category><![CDATA[C]]></category>
		<category><![CDATA[Genetic Programming]]></category>
		<category><![CDATA[Scheme]]></category>

		<guid isPermaLink="false">http://www.thegenetictrader.com/?p=48</guid>
		<description><![CDATA[It&#8217;s been a while since I&#8217;ve done any serious work in C so I&#8217;d forgotten about some of the disadvantages: segmentation violations.
The good news is that I&#8217;ve now implemented everything that I need in C to start my first run and performance is OK &#8211; I&#8217;ve estimated that each generation will take about 2 hours [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been a while since I&#8217;ve done any serious work in C so I&#8217;d forgotten about some of the disadvantages: segmentation violations.</p>
<p>The good news is that I&#8217;ve now implemented everything that I need in C to start my first run and performance is OK &#8211; I&#8217;ve estimated that each generation will take about 2 hours to process which isn&#8217;t bad for the estimated 2.4 billion invocations of the genetic programs per generation. So doing 100 generations will take around 4 days processing which is reasonable. Loading and pre-processing the large historic data file (600,000 rows) takes around 6 seconds compared with over 40 minutes for the Scheme code.</p>
<p>Now for the bad news &#8211; I&#8217;m getting random segmentation violations which are killing my program stone dead. I&#8217;ve been through the code many times over (albeit when I&#8217;m tired), checking and double checking everything, but can&#8217;t see anything. The first generation completes successfully so it looks like the memory corruption occurs during the mutation or simplification process.</p>
<p>Annoyingly, it doesn&#8217;t seg fault at that point, where I&#8217;d be able to find it with the debugger. Instead it creates a problem that causes the seg fault later, so the debugger gives you no clues as to what caused the corruption.</p>
<p>Also annoyingly, it must be an extremely rare event because if I run with small populations and few generations there&#8217;s a good chance I won&#8217;t get a seg fault.</p>
<p>Now I long for Scheme and the pure functional approach to programming that makes this kind of problem impossible.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.thegenetictrader.com/2009/09/24/bloody-c-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
