For the past few months I've been trying to solve the word problems on the Project Euler (pronounced Oiler, so I found out today!), but these problems will truly prep you for thinking outside the box. Problem #3 required you to find the largest prime factor of a composite number (600,851,475,143).
The Breakdown
So let's break it down. What are they asking here?
First off these guys are trying to screw with you, but all they are really asking is what prime numbers are evenly divisible by the main composite number. (From the site: ) The prime factors of 13195 are 5, 7, 13 and 29. 13195 is evenly divisible by 5,7,13,29 which results in following (after running through my algorithm).
1 Wave (TEST_NUMBER: 13195
RESULTS SET: [2639.0, 1885.0, 1015.0, 455.0]
Prime Factors: [5, 7, 13, 29]
What is really cool about prime factors of a composite number? Well if you multiply each prime factor by themselves (eg. 5*7*13*29) you get the composite number of 13195.
So if we can solve for 13195, why not solve for a number over 600 billion? This was the challenge. I didn't know how many prime numbers I would need to include in my @primes Array, so I did what anyone would do and decided to find a number that when squared would be around 600 billion, my calculations were off, but that didn't matter, I chose to grab all the prime numbers up to 775,000. (600,851,475,143 / 775,000) = 775,292.226 - close enough.
Calculating the Primes
So I took different approach to collecting my data for this equation. Sure i started off looking online to try and find a nice csv file with all the prime numbers I wanted, but unfortunately most I could find were up to 1000, and that isn't nearly enough. So I wrote the code to find the prime numbers.
gather_prime_numbers.rb source code
important piece of the puzzle (snippet)
def check_for_prime(current,total)
divisible_by_self = false
divisible_by_one = false
divisible_by_none_other = true
for j in (1..total) do
if (current.to_f % j.to_f) == 0
if j == 1
divisible_by_one = true
elsif j == current && current != 1
divisible_by_self = true
else
divisible_by_none_other = false
end
end
end
# prime numbers are only divisible by 1 and themselves, so @divisible_by_none_other.should == true
if divisible_by_self == true && divisible_by_one == true && divisible_by_none_other == true
#@primes << current
return current
else
return nil
end
end
So what our little algorithm is doing is this, it is testing that each number in our set is only divisible by itself and one, and returning the number as a prime number or as a nil.
So where does this get us? If you have nothing but time on your hands and a great solid state macpro with 8GB RAM (I'm smiling cause the company gave me just such a computer) then have at it. Run the following code from the command line using the gather_prime_numbers.rb code:
> ruby gather_prime_numbers.rb 0 200
--------------------------------------------
Start 0 - LIMIT 200
Current: 100 Primes: 0
PRIMES: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
Benchmark: 0.003527 Ran 1 Times
Current: 200
Primes: 25
PRIMES: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199]
Benchmark: 0.006756
Ran 2 Times
--------------------------------------------
Now for fun. Let's raise the stakes
> ruby gather_prime_numbers.rb 0 45000
(* note to self: You will reach 99% CPU overhead, vs a fair 4% or less of RAM consumption)
When the application starts to cool down you'll be right around the 450 run times block.
Benchmarking Data
Benchmark: 1.495103 (s) (Current Completed cycle of 1000 numbers tested, filtered down to just the primes)
Operation Time: 341.3632109999999 (s) (total time since the algorithm started running)
Ran 449 Times (number of cycles of 1000 numbers checked to find the presence of prime numbers)
Current: 45000 (last number we ended on)
Primes: 4665 (total primes in the set from 0 to 45000
Note: As the primes get larger the time to calculate goes up parabolically. The number of calculations per prime number test is n*n-1 (ex: find prime number of 4500, it takes 20,245,500 calculations (4500*4499)). So it is truly incredible when you think about the numbers up towards the 750,000+. (750,000 * 749,999 = 562,499,250,000 calculations per test), since we are running sets of 1000, we end up churning out 5.62e14 calculations per set of 1000. So after a million, you'd want to be running parallel sets, or using a compiled language.
What are we getting at?
Over the course of a weekend. I let this algorithm just trudge along and after 3 days of my macpro churning at 99% CPU, I had finally gotten the data set I needed, that being the prime numbers up to 750,000. (Note* 750,000 was easier for me since in theory since in the first example (composite = 13195, we only needed to go up to 29), so the rest of the numbers leading up to the full 775,000, would have only really provided an extra 25,000 numbers to test for primes for this main set. My logic here is that there was an average of around 60 primes per set of 1000, so we are looking at only around 1,500 of so lost prime candidates.
Lucky for you guys, I did all the heavy lifting and now you can just take it from here.
Prime Numbers up to 750,000 in CSV format
Well we've gone on over and broken down the word problem into its simple parts. We know that we need 1.) A set of prime numbers as potential prime factors for our target composite number. 2.) We also know that in order to have a set of prime numbers we must have data to work with (which we have now)
Solution
solution files on github
(Spoiler Alert) Solution to Project Euler # 3
#!/usr/bin/ruby
# PROBLEM 3 - PROJECT EULER
# grab csv file into Array chunks
File.open(File.dirname(__FILE__)+"/primes_to_750000.csv") do |f|
@f = f.readlines
end
# strip white spaces
@primes = @f[0].split(',')
@primes.collect! do |n|
n.gsub(/ /,'').to_i
end
#puts @primes.inspect
@count = 0
# grab results and return to calling method
def run_recursive_loop &block
results = yield
return results
end
# main logic, tests for prime factors
def divide_and_find_prime_results(start_number,info_set)
@next_results = []
@factors = []
# try to divide start_number by all members of info_set
info_set.each {|prime|
@tmp_result = (start_number.to_f/prime.to_f)
#puts "START NUMBER: #{start_number.to_s} / PRIME: #{prime.to_s} #{@tmp_result.to_s}"
if (@tmp_result % 2) == 1
#puts "FINALLY: #{prime.to_s}"
@next_results << @tmp_result
@factors << prime
end
}
@count += 1
return {"set"=>@next_results,"prime_factors"=>@factors}
end
# used to display information on each set of tests against the target composite number
def run_til_none_left_to_run(test_number, start_set)
r = run_recursive_loop { divide_and_find_prime_results(test_number,start_set) }
puts "#{@count.to_s} Wave TEST_NUMBER: #{test_number.to_s}\nRESULTS SET: #{r['set'].inspect}\nPrime Factors: #{r['prime_factors'].inspect}"
end
@cap = 600851475143 #13195
def benchmark(&block)
start_benchmark = Time.now
yield
end_benchmark = Time.now
t = end_benchmark - start_benchmark
puts "TOTAL TIME: #{t.to_s}"
end
benchmark{ run_til_none_left_to_run(@cap,@primes) }
So at this point you are seconds away from figuring out the actual meaning behind project oilers (eulers) problem. > prime_factors_3.rb
and your off.
I do these series because it helps me with my own understanding of mathematics, if you or anyone else has a better trick, or can figure out how to cut the cpu overhead from my prime number calculator, I'd be thrilled.
Happy Hacking
