At great expense, Farmer John sequences the genomes of his cows. Each genome is a string of length MM built from the four characters A, C, G, and T. When he lines up the genomes of his cows, he gets a table like the following, shown here for N=3N=3:
Positions: 1 2 3 4 5 6 7 ... M Spotty Cow 1: A A T C C C A ... T Spotty Cow 2: G A T T G C A ... A Spotty Cow 3: G G T C G C A ... A Plain Cow 1: A C T C C C A ... G Plain Cow 2: A G T T G C A ... T Plain Cow 3: A G T T C C A ... T
Looking carefully at this table, he surmises that positions 2 and 4 are sufficient to explain spottiness. That is, by looking at the characters in just these two positions, Farmer John can predict which of his cows are spotty and which are not (for example, if he sees G and C, the cow must be spotty).
Farmer John is convinced that spottiness can be explained not by just one or two positions in the genome, but by looking at a set of three distinct positions. Please help him count the number of sets of three distinct positions that can each explain spottiness.
FJ有n头有斑点的牛和n头没有斑点的牛。由于他刚刚学完牛的基因学的课程,他想知道牛有没有斑点是否
与牛的基因有关。
FJ花了巨大的代价测出了每个牛的基因,每头牛的基因用一个长度为M的由“A,C,G,T”的串构成。FJ将这
些串写成一个表/矩阵,就像图中这样
(N=3的例子)
FJ仔细的观察这个表,他发现通过观测2,4位置的字符串可以预测牛是否有斑点。
(在这个例子中,假如他看到24位置是GC、AT或者AC就可以断定其有斑点,因为1号有斑点的牛24位置基因为AC,2号为AT,3号为GC,而且没有任何一头无斑点的牛的24位置出现过这三个串)
FJ认为,1个或者两个位点是不能够区分品种的,必须是刚好3个位点。他想知道能用多少组三个本质不同的位置判断牛的斑点,{1,2,3}和{1,3,2}是本质相同的
3 8
AATCCCAT
GATTGCAA
GGTCGCAA
ACTCCCAG
ACTCGCAT
ACTTCCAT
22