by Dr. Bruce McLaughlin
This article presents evidence suggesting that God was the originator of the genetic code for life and that He encrypted messages about this code in the Hebrew book of Genesis.
Introduction
The infinitude of God permits Him to instantly track all possible histories and futures of you, me and the universe as easily as we count the wheels on a bicycle. According to one Judeo-Christian tradition, God arranged the 304,805-character string of concatenated words in the Torah not only to reveal a spiritual message but also to encrypt fundamental information about the beginning of the universe and its development over time, including the entirety of physics, chemistry, biology and human history: a message within a message.
In Genesis of the Reciprocal Fine Structure Constant by B. McLaughlin, several critical dimensionless numbers of physics/mathematics are predicted from the first 12 characters of the first verse of Genesis. Is it possible that this same string of 12 characters provides information about the codons and corresponding amino acids of the genetic code? Given 64 codons, can the first 12 characters from the first verse of Genesis suggest which sets of codons specify a particular amino acid?
Background
The knowledge that genes consist of complex nucleotide sequences was a major breakthrough in the study of genetics. However, in the early years of discovery, the number of nucleotides needed to code for one amino acid was unknown. Four types of nucleotides were known to code for the 20 amino acids of life. Four nucleotides taken two at a time provide only 16 different combinations, which is insufficient to code for 20 amino acids. Four nucleotides taken three at a time provide 64 combinations. Each three-member combination of nucleotides is called a codon. It was believed, at one time, that only 20 codons were used to specify amino acids. However, it is now understood that as many as six different nucleotide triplets may specify the same amino acid. The term degeneracy is used to describe the fact that a given amino acid may be specified by more than one codon. The fundamental characteristics of the genetic code are now well established; it is a non-overlapping, triplet code without punctuation. Each codon comprises three adjacent nucleotides and specifies one amino acid or a STOP signal.
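The counting argument above can be checked with a short sketch. It assumes the standard genetic code written as a 64-letter string in the order UUU, UUC, UUA, UUG, UCU and so on, with one-letter amino acid symbols and `*` for STOP; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in the order UUU, UUC, UUA, UUG, UCU, ...
# '*' marks a STOP codon.
STANDARD_CODE = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# All ordered pairs and triplets of the four nucleotides.
doublets = ["".join(p) for p in product(BASES, repeat=2)]
triplets = ["".join(p) for p in product(BASES, repeat=3)]

# Map each codon to its amino acid and count codons per amino acid (degeneracy).
codon_to_aa = dict(zip(triplets, STANDARD_CODE))
degeneracy = Counter(aa for aa in codon_to_aa.values() if aa != "*")
six_fold = sorted(aa for aa, n in degeneracy.items() if n == 6)

print(len(doublets), len(triplets))   # 16 64
print(len(degeneracy))                # 20 amino acids
print(max(degeneracy.values()))       # 6
print(six_fold)                       # ['L', 'R', 'S'] (Leu, Arg, Ser)
```

Sixteen doublets fall short of 20 amino acids, while 64 triplets leave room for degeneracy; Leu, Arg and Ser are the three amino acids with six codons each.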
Natural Numerical Order for Codons and Amino Acids
A codon can be defined equally well by: (1) an ordered set of 3 letters from the population [U, C, A, G], (2) a base 10 number (0 – 63), (3) a base 4 triplet (000 – 333) and (4) a base 3 quadruplet (0000 – 2100). Letting the base 4 triplets 000 to 333 represent UUU to GGG (by assigning U=0, C=1, A=2 and G=3) creates a natural numerical order for the 0 to 63 codons comprising the genetic code for life (Table 1). This numerical order can also be expressed as base 3 quadruplets (0000 to 2100). Each quadruplet abcd can be viewed as a Lipschitz quaternion (a + bi + cj + dk) with a norm defined by a² + b² + c² + d². These norms, like the numerical positions of the codons, alternate between even and odd: a digit and its square have the same parity, and every power of 3 is odd, so the norm and the codon number are always both even or both odd. This facilitates even/odd classification when working with base 3 quadruplets.
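The four equivalent representations can be computed directly. The following is a minimal sketch, assuming the digit assignment U=0, C=1, A=2, G=3 given above; the helper names `to_base` and `codon_row` are hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "UCAG"  # U=0, C=1, A=2, G=3, as assigned in the text


def to_base(n, base, width):
    """Write n in the given base, zero-padded to a fixed number of digits."""
    digits = []
    for _ in range(width):
        n, r = divmod(n, base)
        digits.append(str(r))
    return "".join(reversed(digits))


def codon_row(codon):
    """Return (base 4 triplet, base 10 number, base 3 quadruplet, norm)."""
    base4 = "".join(str(BASES.index(b)) for b in codon)
    number = int(base4, 4)
    base3 = to_base(number, 3, 4)
    norm = sum(int(d) ** 2 for d in base3)  # a^2 + b^2 + c^2 + d^2
    return base4, number, base3, norm


print(codon_row("AUG"))  # ('203', 35, '1022', 9)
print(codon_row("GGG"))  # ('333', 63, '2100', 5)

# The norm has the same parity as the codon number for all 64 codons.
rows = [codon_row("".join(p)) for p in product(BASES, repeat=3)]
parity_matches = all(norm % 2 == number % 2 for _, number, _, norm in rows)
print(parity_matches)  # True
```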
Based on observation, many amino acids of life are uniquely specified by two adjacent codons (even/odd) in the natural numerical order. If this rule were followed throughout, each adjacent pair of even/odd codons would specify a different amino acid. This is, in fact, the case for Phe, Tyr, Cys, His, Gln, Asn, Lys, Asp and Glu. Each of these nine amino acids is specified by a single pair of adjacent even/odd codons, for a total of 18 codons. But this pattern is not followed for the remaining 11 amino acids. Given the 46 remaining codons, can the first 12 characters from the first verse of Genesis suggest which sets of codons specify the remaining 11 amino acids (and 3 STOP positions)?
Codon Base-4 Base-10 Amino-acid Base-3 Norm
UUU 000 0 Phe 0000 0
UUC 001 1 Phe 0001 1
UUA 002 2 Leu 0002 4
UUG 003 3 Leu 0010 1
UCU 010 4 Ser 0011 2
UCC 011 5 Ser 0012 5
UCA 012 6 Ser 0020 4
UCG 013 7 Ser 0021 5
UAU 020 8 Tyr 0022 8
UAC 021 9 Tyr 0100 1
UAA 022 10 Stop 0101 2
UAG 023 11 Stop 0102 5
UGU 030 12 Cys 0110 2
UGC 031 13 Cys 0111 3
UGA 032 14 Stop 0112 6
UGG 033 15 Trp 0120 5
CUU 100 16 Leu 0121 6
CUC 101 17 Leu 0122 9
CUA 102 18 Leu 0200 4
CUG 103 19 Leu 0201 5
CCU 110 20 Pro 0202 8
CCC 111 21 Pro 0210 5
CCA 112 22 Pro 0211 6
CCG 113 23 Pro 0212 9
CAU 120 24 His 0220 8
CAC 121 25 His 0221 9
CAA 122 26 Gln 0222 12
CAG 123 27 Gln 1000 1
CGU 130 28 Arg 1001 2
CGC 131 29 Arg 1002 5
CGA 132 30 Arg 1010 2
CGG 133 31 Arg 1011 3
AUU 200 32 Ile 1012 6
AUC 201 33 Ile 1020 5
AUA 202 34 Ile 1021 6
AUG 203 35 Met 1022 9
ACU 210 36 Thr 1100 2
ACC 211 37 Thr 1101 3
ACA 212 38 Thr 1102 6
ACG 213 39 Thr 1110 3
AAU 220 40 Asn 1111 4
AAC 221 41 Asn 1112 7
AAA 222 42 Lys 1120 6
AAG 223 43 Lys 1121 7
AGU 230 44 Ser 1122 10
AGC 231 45 Ser 1200 5
AGA 232 46 Arg 1201 6
AGG 233 47 Arg 1202 9
GUU 300 48 Val 1210 6
GUC 301 49 Val 1211 7
GUA 302 50 Val 1212 10
GUG 303 51 Val 1220 9
GCU 310 52 Ala 1221 10
GCC 311 53 Ala 1222 13
GCA 312 54 Ala 2000 4
GCG 313 55 Ala 2001 5
GAU 320 56 Asp 2002 8
GAC 321 57 Asp 2010 5
GAA 322 58 Glu 2011 6
GAG 323 59 Glu 2012 9
GGU 330 60 Gly 2020 8
GGC 331 61 Gly 2021 9
GGA 332 62 Gly 2022 12
GGG 333 63 Gly 2100 5
Table 1. Coding for the Amino Acids of Life
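The observation that nine amino acids are each specified by exactly one adjacent even/odd pair can be confirmed from the assignments in Table 1. The sketch below assumes the same codon order as the table and encodes the amino acid column as a 64-letter string of one-letter symbols (`*` for STOP); the variable names are illustrative.

```python
from collections import defaultdict

# Amino acid column of Table 1 in codon order 0..63, one-letter symbols, '*' = STOP.
STANDARD_CODE = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
NAMES = {"F": "Phe", "Y": "Tyr", "C": "Cys", "H": "His", "Q": "Gln",
         "N": "Asn", "K": "Lys", "D": "Asp", "E": "Glu"}

# Collect the codon numbers assigned to each amino acid.
positions = defaultdict(list)
for number, aa in enumerate(STANDARD_CODE):
    positions[aa].append(number)

# Keep amino acids whose only codons are one even number and the next odd number.
single_pair = [
    aa for aa, nums in positions.items()
    if aa != "*" and len(nums) == 2 and nums[0] % 2 == 0 and nums[1] == nums[0] + 1
]
remaining_codons = 64 - 2 * len(single_pair)

print([NAMES[aa] for aa in single_pair])
# ['Phe', 'Tyr', 'Cys', 'His', 'Gln', 'Asn', 'Lys', 'Asp', 'Glu']
print(remaining_codons)  # 46
```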
Analysis
For the purpose of analysis, consider the color-coded 12 by 12 matrix, designated as M (Figure 1); this matrix was central to the analysis in Genesis of the Reciprocal Fine Structure Constant by B. McLaughlin. M was constructed from the first 12 characters of the first verse of Genesis.
In this paper, certain matrix elements are assigned non-black colors in accordance with specific rules. The colors are non-overlapping and each color represents either a single codon (e.g. 0120 or 1022) or multiple codons grouped together as intersecting strings.
Rules of Color Assignment
- If two or more even base 3 quadruplets appear as intersecting strings in the matrix M, then the corresponding even/odd quadruplet pairs code for the same amino acid. A string may be read up, down, left, right or diagonally. Two intersecting strings may share no elements or several elements, but each must have at least one element contiguous with the other. Also, equivalent intersecting string patterns may be found in more than one matrix location.
- Any deviation from the even/odd pattern requires that a string be identified for each individual codon.
This will account for 46 codons and 11 amino acids (plus 3 STOP). Each of the 9 remaining amino acids is specified by a single even/odd pair of codons.
0 2 0 2 1 2 0 2 0 1 2 1
0 0 0 0 0 1 0 0 0 0 0 0
1 1 0 2 0 0 1 1 0 0 2 0
0 2 0 2 1 2 0 2 0 0 2 0
0 1 0 0 0 0 0 0 0 0 0 1
0 0 0 1 0 1 1 2 1 1 1 0
0 2 0 0 0 0 1 1 0 1 0 0
2 1 2 0 0 1 2 0 0 0 0 1
0 2 0 0 0 0 1 1 0 0 0 0
1 1 0 0 2 0 1 0 0 0 1 0
0 0 1 0 0 0 0 0 1 0 0 1
0 2 0 0 0 0 0 0 0 0 2 1
Figure 1. Matrix M with Non-overlapping Clusters of Intersecting Strings
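Reading a quadruplet "up, down, left, right or diagonally" amounts to a word search over M. The following is a minimal sketch of such a search, assuming each string is read in a straight line of four adjacent cells; the function name `find_string` is hypothetical. It only lists candidate locations. It does not reproduce the color assignments of Figure 1, which depend on which occurrences are chosen and how they intersect.

```python
# Matrix M from Figure 1, one string per row.
M_ROWS = [
    "020212020121",
    "000001000000",
    "110200110020",
    "020212020020",
    "010000000001",
    "000101121110",
    "020000110100",
    "212001200001",
    "020000110000",
    "110020100010",
    "001000001001",
    "020000000021",
]
M = [[int(ch) for ch in row] for row in M_ROWS]

# The eight reading directions as (row step, column step).
DIRECTIONS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def find_string(matrix, digits):
    """Return (row, col, (dr, dc)) for every straight-line occurrence of digits."""
    target = [int(d) for d in digits]
    rows, cols = len(matrix), len(matrix[0])
    hits = []
    for r in range(rows):
        for c in range(cols):
            for dr, dc in DIRECTIONS:
                end_r = r + dr * (len(target) - 1)
                end_c = c + dc * (len(target) - 1)
                if not (0 <= end_r < rows and 0 <= end_c < cols):
                    continue  # the string would run off the matrix
                if all(matrix[r + k * dr][c + k * dc] == t for k, t in enumerate(target)):
                    hits.append((r, c, (dr, dc)))
    return hits


# Example: every location where the quadruplet 0202 can be read.
for hit in find_string(M, "0202"):
    print(hit)
```

Rows and columns are numbered from 0, so the hit `(0, 0, (0, 1))` means the string starts in the top-left cell and is read to the right.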
The following non-overlapping collections of intersecting strings can be identified in Figure 1.
- The yellow string designates 0120 which defines a stand-alone coding for Trp.
- The light green string designates 1022 which defines a stand-alone coding for Met.
- Brown strings 1012, 1020 and 1021 intersect to define the three strings coding for Ile.
- Light blue strings 0202 and 0211 intersect to define the 2 corresponding even/odd pairs coding for Pro.
- Orange strings 1210 and 1212 intersect to define the 2 corresponding even/odd pairs coding for Val.
- Gray strings 2020 and 2022 intersect to define the 2 corresponding even/odd pairs coding for Gly.
- Red strings 0002, 0121 and 0200 intersect to define the 3 corresponding even/odd pairs coding for Leu.
- Purple strings 1001, 1010 and 1201 intersect to define the 3 corresponding even/odd pairs coding for Arg.
- Dark green strings 0011, 0020 and 1122 intersect to define the 3 corresponding even/odd pairs coding for Ser. This string intersection also designates 1221 and 2000, which define the 2 corresponding even/odd pairs coding for Ala. The five even codons can be further separated by grouping the smallest three (Ser) and the largest two (Ala).
- Dark blue strings 1100 and 1102 intersect to define the 2 corresponding even/odd pairs coding for Thr. This string intersection also designates 0101, 0102 and 0112, which define stand-alone coding for STOP. These five codons can be further separated by grouping the smallest three (STOP) and the largest two (Thr).
Table 2. Corresponding Amino Acids for Non-Overlapping Clusters of Intersecting Strings
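The bookkeeping in Table 2 can be tallied mechanically. The sketch below takes the quadruplets exactly as listed above, expands each paired string into its even/odd codon pair, and compares the result with the standard genetic code; the dictionary names are illustrative. It checks only that the listed strings are consistent with Table 1, not that they appear in M.

```python
# Amino acid column of Table 1 in codon order 0..63, one-letter symbols, '*' = STOP.
STANDARD_CODE = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
ONE_LETTER = {"Trp": "W", "Met": "M", "Ile": "I", "Pro": "P", "Val": "V", "Gly": "G",
              "Leu": "L", "Arg": "R", "Ser": "S", "Ala": "A", "Thr": "T", "STOP": "*"}

# Strings that each stand for a single codon.
STAND_ALONE = {
    "Trp": ["0120"], "Met": ["1022"],
    "Ile": ["1012", "1020", "1021"],
    "STOP": ["0101", "0102", "0112"],
}
# Even strings that each stand for an even/odd pair of codons.
PAIRED = {
    "Pro": ["0202", "0211"], "Val": ["1210", "1212"], "Gly": ["2020", "2022"],
    "Leu": ["0002", "0121", "0200"], "Arg": ["1001", "1010", "1201"],
    "Ser": ["0011", "0020", "1122"], "Ala": ["1221", "2000"], "Thr": ["1100", "1102"],
}

assigned = {}  # codon number -> name
for name, strings in STAND_ALONE.items():
    for s in strings:
        assigned[int(s, 3)] = name
for name, strings in PAIRED.items():
    for s in strings:
        n = int(s, 3)
        assigned[n] = name      # even codon
        assigned[n + 1] = name  # adjacent odd codon

mismatches = [n for n, name in assigned.items() if STANDARD_CODE[n] != ONE_LETTER[name]]
amino_acids = {name for name in assigned.values() if name != "STOP"}
unassigned = [n for n in range(64) if n not in assigned]

print(len(assigned), len(amino_acids), mismatches)  # 46 11 []
print(len(unassigned))  # 18, the nine single even/odd pairs
```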
0 2 0 2 1 2 0 2 0 1 2 1
0 0 0 0 0 1 0 0 0 0 0 0
1 1 0 2 0 0 1 1 0 0 2 0
0 2 0 2 1 2 0 2 0 0 2 0
0 1 0 0 0 0 0 0 0 0 0 1
0 0 0 1 0 1 1 2 1 1 1 0
0 2 0 0 0 0 1 1 0 1 0 0
2 1 2 0 0 1 2 0 0 0 0 1
0 2 0 0 0 0 1 1 0 0 0 0
1 1 0 0 2 0 1 0 0 0 1 0
0 0 1 0 0 0 0 0 1 0 0 1
0 2 0 0 0 0 0 0 0 0 2 1
Figure 2. Matrix M with 2 Clusters of Intersecting Strings Containing All 64 Codons
As an aside, a small portion of matrix M can be used to generate base 3 quadruplet strings for all 64 codons as shown in Figure 2. But, in Figure 2, groups of strings cannot be connected to a particular amino acid.
Conclusion
By reverse engineering the non-overlapping clusters of intersecting strings in Figure 1 and using the rules for base 3 quadruplet (codon) assignment, the 64 codons point to 21 entities. Equivalently, the 64 base 4 triplets 000 to 333 (UUU to GGG) map onto 21 entities. The fact that these 21 entities comprise 20 amino acids plus STOP cannot be determined from Figure 1; nor can it be determined that the 64 codons are actually strings of nucleotides. However, the information in Figure 1 was obtained from only the first 12 Hebrew characters in the first verse of Genesis. Much additional information is encrypted in the remainder of the 304,805 characters of the Torah.