In 2016, I participated for the first time in Google Code Jam (GCJ). I like games, such as Go and Chess, but I am not good at playing fast games. Similarly, I like programming, but I am never good at writing programs under time pressure. GCJ allows contestants to use any programming languages. I used Picat, a logic-based hybrid programming language, which takes features from logic programming (logic variables, unification, and backtracking), functional programming (pattern-matching rules, functions, and list/array comprehensions), dynamic programming with tabling, planning, constraint solving with CP (constraint programming), SAT (satisfiability), and MIP (Mixed Integer Programming), and imperative programming (loops and assignments). These features give me courage to enter in GCJ as they are very handy for many of the past problems. I am also fluent in C++, Java, and Python, but I would never imagine that I could compete with these languages. This page provides the solutions in Picat that I wrote for GCJ'16, after editing and simplification for better readability.
Qualification Round |
Round 1A |
Round 1B |
Round 1C |
import ordset. main => T = read_int(), foreach (TC in 1..T) N = read_int(), C0 = N.to_string().sort_remove_dups(), (count(N,N,C0,L) -> printf("Case #%w: %w\n", TC,L) ; printf("Case #%w: INSOMNIA\n", TC) ) end. table count(N,C,Ds,L), len(Ds) == 10 => L=C. count(N,C,Ds,L) => C1 = C+N, Ds1 = union(Ds, to_string(C1).sort_remove_dups()), count(N,C1,Ds1,L).
This program doesn't check the special input, N=0, for which the sheep can never get 10 digits. It uses tabling to avoid infinite looping; the call to count
fails if the starting number is 0. Tabling is useful for preventing infinite loops and avoiding redundant computations. The program would still work if the '+' operation were not monotonic.
import planner. main => T = read_line().to_int(), foreach (TC in 1..T) S = read_line(), best_plan(S,_,Len), printf("Case #%w: %w\n", TC,Len) end. final(S) => foreach(C in S) C=='+' end. action(S,NewS,Action,Cost) => Action = flip, Cost = 1, append(S1,S2,S), flip(S1,[],FS1), NewS = FS1++S2. flip([],FS0,FS) => FS=FS0. flip(['+'|S],FS0,FS) => flip(S,['-'|FS0],FS). flip([_|S],FS0,FS) => flip(S,['+'|FS0],FS).
This program models the problem as a planning problem. The input string is the initial state, and the goal state is a string of the same length that only consists of '+'. The action
predicate uses the built-in predicate append
to nondeterministically split the string into two parts, and builds a new state by flipping the prefix string.
This program easily solves the small input. However, it consumes too much time for the large input. This program can be easily improved by making the action deterministic, splitting the string at the first position where the string changes from '+' to '-', or from '-' to '+'.
import util. main => T = read_line().to_int(), foreach (TC in 1..T) [N,J] = [to_int(Token) : Token in read_line().split()], printf("Case #%w:\n", TC), do_case(N,J) end. do_case(N,J) => MinStr = ['1'] ++ ['0' : _ in 1..N-2] ++ ['1'], MaxStr = ['1' : _ in 1..N], Min2 = parse_radix_string(MinStr,2), Max2 = parse_radix_string(MaxStr,2), Map = get_global_map(), Map.put(j,J), HalfN = N div 2, between(Min2,Max2,2,Val2), %% Val2 is a member in Min2..2..Max2 Str = to_binary_string(Val2), Str1 = new_list(HalfN), once append(Str1,Str2,Str), f(Str1,Str2,HalfN,2,F2), f(Str1,Str2,HalfN,3,F3), f(Str1,Str2,HalfN,4,F4), f(Str1,Str2,HalfN,5,F5), f(Str1,Str2,HalfN,6,F6), f(Str1,Str2,HalfN,7,F7), f(Str1,Str2,HalfN,8,F8), f(Str1,Str2,HalfN,9,F9), f(Str1,Str2,HalfN,10,F10), printf("%s %w %w %w %w %w %w %w %w %w\n",Str,F2,F3,F4,F5,F6,F7,F8,F9,F10), J1 = Map.get(j), if J1 == 1 then halt else Map.put(j,J1-1), fail end. between(From,To,_Step,_X), From>To => fail. between(From,_To,_Step,X) ?=> X=From. between(From,To,Step,X) => between(From+Step,To,Step,X). table tab_pow(Base,Len) = Base**Len. f(Str1,Str2,HalfN,Base,F) => once f_aux(Str1,Str2,HalfN,Base,F). %% "once Call" can succeed only once f_aux(Str1,Str2,HalfN,Base,F) => Val1 = parse_radix_string(Str1,Base), Coe = tab_pow(Base,HalfN), Val2 = parse_radix_string(Str2,Base), (factor(Val1,F); factor(Coe,F)), factor(Val2,F). table factor(V,F) ?=> between(2,floor(sqrt(V)),F), V mod F == 0. factor(V,F) => F = V.
This program uses generate-and-test to search for a string of 0's and 1's between "10...01" and "11...11" that has a factor for each of the bases from 2 to 10. As multiple answers are required, this program keeps track of the number of generated answers by using a global map, and halts after the required number of answers has been reached.
This program uses tabling to store all the factors of a value. It is infeasible to table all of the factors of large numbers of 32 digits. The program splits a string into two halves. Let B be a base, H be the value of the higher half, and L be the value of the lower half. The value represented by the original string in base B is H*(B**M)+L, where M = N div 2. A number F is a factor of the value if it is a common factor of L and H or L and B**M. Although the program can produce correct output for both the small and large inputs, it is not complete, meaning that it may miss solutions.
import util. main => T = read_line().to_int(), foreach (TC in 1..T) [K,C,S] = [to_int(Token) : Token in read_line().split()], printf("Case #%w:", TC), do_case(K,C,S) end. do_case(K,C,S) => P = 1, foreach (_ in 1..S) printf(" %w", P), P := P + ip(K,C) + 1 end, nl. table ip(K,1) = 0. ip(K,2) = K. ip(K,C) = K**(C-1).
This program is for the small input only; it assumes that the number of students S equals the number of tiles K in the original sequence. For each tile in the original sequence, there is a tile in the sequence of complexity C that is guaranteed to be the same as the tile. The program computes the position of each tile. This idea is the same as the one given in the Contest Analysis.
Round 1A
I scored 50, which was not good enough to move me to the next round for now. I could have scored 21 more points from the large input of "Rank and File" had I had a fast computer. The problems look extremely well suited to Picat, but I made several detours during the contest.
main => T = read_line().to_int(), foreach (TC in 1..T) S = read_line(), do_case(TC,S) end. do_case(TC,S) => arrange(TC,S,'A',[],[]). arrange(TC,[],_LC,S1,S2) => printf("Case #%w: %s\n", TC,S1++reverse(S2)). arrange(TC,[C|S],LC,S1,S2), C @>= LC => %% (C @>= LC) can be written as (ord(C) >= ord(LC)) arrange(TC,S,C,[C|S1],S2). arrange(TC,[C|S],LC,S1,S2) => arrange(TC,S,LC,S1,[C|S2]).There is a brute-force algorithm for this problem, but I didn't try it (I learned a lesson from the Pancakes problem). For this problem, the greedy algorithm works well. For each letter, if the letter is greater than the leading letter in the current word, then put the letter in the front; otherwise, put it in the back. In order to avoid accessing the last letter of a list, which takes time proportional to the length of the list, the program represents the current word as two lists; the front of the first list is the font of the word, and the front of the second list is the back of the word. In this way, it takes constant time to add a letter into the front or the back of the current word.
import util,cp. main => T = read_line().to_int(), foreach (TC in 1..T) N = read_line().to_int(), Table = [{to_int(Token) : Token in read_line().split()} : _ in 1..2*N-1], not not do_case(TC,N,Table) %% "not not Call" discards bindings and terms created by "Call" end. do_case(TC,N,Table) => ST = sort(Table), A = new_array(N,N), B = transpose(A), between(1,N,MI), table_in([A[I] : I in 1..N, I !== MI]++[B[I] : I in 1..N], Table), foreach(I in 1..N, J in 2..N) A[I,J-1] #< A[I,J] end, foreach(J in 1..N, I in 2..N) A[I-1,J] #< A[I,J] end, solve(A), Rows = [A[I] : I in 1..N, I !== MI], Cols = [B[I] : I in 1..N], ST = sort(Rows++Cols), printf("Case #%w: ", TC), foreach (J in 1..N) printf("%w ",A[MI,J]) end, nl.
This program computes a layout of the N*N square grid using constraints, and finds the missing list as the result. It uses the table_in
constraint to ensure that each row and each column is among the given lists. The program easily solves the small input, but requires a fast computer with a large amount of memory to solve the large input (the submitted version, which used several table_in
constraints, ran out of memory).
This program is an example of using-a-cannon-to-kill-a-fly. The correct algorithm for the problem is to find the numbers that occur an odd number of times in the lists. This insightful idea can be used to improve the program if a configuration of the grid is required.
When using a language like Picat, there is a temptation to go for an easy solution rather than a clever one. The GCJ problem writers probably don't have constraint programming in mind. Next time I see a problem that is suited to constraint programming, I'll ask myself if I missed any clever ideas. I'll still use constraint programming as a last resort to solve small inputs.
import util,cp. main => T = read_line().to_int(), foreach (TC in 1..T) N = read_line().to_int(), BFF = {to_int(Token) : Token in read_line().split()}, not not do_case(TC,N,BFF) %% "not not Call" discards bindings and terms created by "Call" end. do_case(TC,N,BFF) => member(Size,N..-1..2), L = new_array(Size), L :: 1..N, all_distinct(L), foreach(I in 1..Size) if I==1 then Left = Size, Right = 2 elseif I==Size then Left = Size-1, Right = 1 else Left = I-1, Right = I+1 end, foreach (J in 1..N) L[I] #= J #=> (L[Left] #= BFF[J] #\/ L[Right] #= BFF[J]) end end, solve(L), printf("Case #%w: %w\n", TC,Size).
I was propelled by the same temptation to use constraint programming for this problem. This time the decision might have been well justified because it's not easy to find an efficient algorithm for the problem under time pressure. This program just iterates over the numbers from N down to 2, and builds a circle such that each kid has a friend to her left or right. The program easily solves the small input, but cannot handle the large input.
It's possible to improve the model by using extra constraints to break the symmetry and using a better strategy to label the variables. Nevertheless, when N=1000, the model is beyond the power of any normal computers because it consists of O(N^2) number of constraints.
The good news is that I wrote a program for each of the problems that could solve the small input; the bad news is that none of the programs could solve the large input. I finished 30 minutes before the competition was over. I should have waited for some time before downloading the large inputs. I only scored 35 points, which put me in the 1799th place among 7894 contestants. It's not impressive, but I have no regrets because I didn't put efforts into thinking and improving the solutions for large inputs.
main => T = read_line().to_int(), foreach (TC in 1..T) S = read_line(), printf("Case #%w: ", TC), dc(S,["ZERO", "ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX", "SEVEN", "EIGHT", "NINE"],Ds), foreach (D in Ds) printf("%w", D) end, nl end. dc([],_,Os) => Os=[]. dc(S,Ds@[D|_],Os), match(S,S1,D) ?=> d(D,A), Os = [A|OsR], dc(S1,Ds,OsR). dc(S,[_|Ds],Os) => dc(S,Ds,Os). match(S,SR,[]) => S=SR. match(S,SR,[C|D]) => once select(C,S,S1), match(S1,SR,D). index (+,-) d("ZERO",0). d("ONE",1). d("TWO",2). d("THREE",3). d("FOUR",4). d("FIVE",5). d("SIX",6). d("SEVEN",7). d("EIGHT",8). d("NINE",9).
This is a very simple program. It searches in the input string for the English words, in the order from "ZERO" to "NINE", and deletes the matching word from the string. The program cannot handle the large input because of the large search space. For many cases, such as when the matching word is "FOUR" or "SIX", the rule can be made deterministic. Nevertheless, such an optimization is not adequate for solving the large input. The Contest Analysis gives the following order that does not require guessing at all:
import util, cp. main => T = read_line().to_int(), foreach (TC in 1..T) [C,J] = read_line().split(), not not do_case(TC,C,J) end. do_case(TC,C,J) => Len = len(C), Vc = new_list(Len), Vj = new_list(Len), Vc :: 0..9, Vj :: 0..9, fill(Vc,C), fill(Vj,J), val(reverse(Vc),1,Valc), val(reverse(Vj),1,Valj), Diff #= abs(Valc-Valj), solve($[min(Diff)],(Vc,Vj)), printf("Case #%w: ", TC), foreach (D in Vc) printf("%d",D) end, print(' '), foreach (D in Vj) printf("%d",D) end, nl. fill([],[]) => true. fill([_V|Vs],['?'|Ds]) => fill(Vs,Ds). fill([V|Vs],[D|Ds]) => V = ord(D)-ord('0'), fill(Vs,Ds). val([],_P,Exp) => Exp=0. val([D|Ds],P,Exp) => val(Ds,10*P,Exp1), Exp = $(P*D+Exp1).
This program uses constraint programming to assign a digit to each '?' such that the score difference is minimal. There are many cases where the search space can be reduced; for example, if the leading digit of one score is '?' and the leading digit of the other score is known to be D
, then the domain of variable should be [(D-1) mod 10, D, (D+1) mod 10]
, rather than all of the digits.
import util,sat. main => T = read_line().to_int(), foreach (TC in 1..T) N = read_line().to_int(), Topics = [Topic : _ in 1..N, Topic = read_line().split()], not not dc(TC,N,Topics) end. dc(TC,N,Topics) => Map = new_map(), num_topics(Topics,TNums,Map,0), Bs = new_list(N), Bs :: 0..1, gen_contrs(N,TNums,Bs), solve($[min(sum(Bs))],Bs), printf("Case #%w: %w\n", TC,N-sum(Bs)). num_topics([],Nums,_Map,_I) => Nums=[]. num_topics([[F,S]|Tos],Nums,Map,I) => Nums = [[FNum,SNum]|NumsR], get_num(F,Map,FNum,I,I1), get_num(S,Map,SNum,I1,I2), num_topics(Tos,NumsR,Map,I2). get_num(S,Map,Num,I,I1) => Num = Map.get(S,I), (Num == I -> Map.put(S,I), I1 = I+1 ; I1 = I ). gen_contrs(N,Topics,Bs) => ATopics = to_array(Topics), ABs = to_array(Bs), foreach (I in 1..N) ABs[I] #= 0 #=> sum([Reused1 : J in 1..N, J!==I, Reused1 #<=> (ABs[J] #= 1 #/\ ATopics[I,1] #= ATopics[J,1])]) #> 0 #/\ sum([Reused2 : J in 1..N, J!==I, Reused2 #<=> (ABs[J] #= 1 #/\ ATopics[I,2] #= ATopics[J,2])]) #> 0 end.
A topic is fake if the first and second words occur in some real topics. This program naively generates this constraint, and uses the sat
module in Picat to find the maximum number of fake topics (i.e., the minimum number of real topics). For a logical reasoning problem like this one, the sat
module tends to perform better than the cp
module. Probably this is the only program in the competition that uses SAT.
The above model is not efficient. A more efficient model is to group topics by words. For each word, we use a constraint to ensure that at least one of the topics that contain this word is real. This model is more efficient than the above model because a topic is known to be real if either of its words does not occur in any other topics. The following implements this model:
import util,sat. main => T = read_line().to_int(), foreach (TC in 1..T) N = read_line().to_int(), Topics = [Topic : _ in 1..N, Topic = read_line().split()], not not dc(TC,N,Topics) end. dc(TC,N,Topics) => Map1 = new_map(), Map2 = new_map(), Bs = new_list(N), Bs :: 0..1, foreach ({[W1,W2],B} in zip(Topics,Bs)) regist(W1,B,Map1), regist(W2,B,Map2) end, gen_constr(Map1), gen_constr(Map2), solve($[min(sum(Bs))],Bs), printf("Case #%w: %w\n", TC,N-sum(Bs)). regist(W,B,Map) => Bs = Map.get(W,[]), Map.put(W,[B|Bs]). gen_constr(Map) => foreach ((_ = Bs) in Map) sum(Bs) #>= 1 end.This program can easily solve the large input if
mip
is used instead of sat
.
Round 1C started at 5AM EDT. I had to get up at 4AM in order to have some warm-up time. I bought a cup of coffee the evening before, but that didn't help much. I solved Problem A easily but was stuck on Problem C. I misunderstood the definition of combinations, trying to count all combinations that have two garments in common (so (1,2,3),(1,2,4),(2,2,3) would count 3), not two-garment combinations as required ((1,2,3),(1,2,4),(2,2,3) should count 2). It was too late to give up on the problem, and I didn't have enough time to complete the solution for Problem B, which is easy to solve using constraint programming, at least for the small input.
import util. main => T = read_line().to_int(), foreach (TC in 1..T) N = read_line().to_int(), Ps = [to_int(Token) : Token in read_line().split()], dc(TC,N,Ps) end. dc(TC,N,Ps) => PLs = [Pair : Pair in zip(Ps,1..N)].sort_down(), Total = sum(Ps), printf("Case #%w: ", TC), printf(stderr,"Case #%w:\n", TC), ev(PLs,Total). ev([],_l) => nl. ev([{0,_}|_],_) => nl. ev([{P1,L1},{P2,L2}|PLs],Total), PLs1 = [{P1-1,L1},{P2-1,L2}|PLs].sort_down(), safe(PLs1,Total-2) => pr(L1),pr(L2),print(' '), ev(PLs1,Total-2). ev([{P,L}|PLs],Total) => pr(L),print(' '), (P-1 == 0 -> PLs1 = PLs; PLs1 = [{P-1,L}|PLs].sort_down()), ev(PLs1,Total-1). safe(PLs,Total) => foreach({P,_} in PLs) P =< Total div 2 end. pr(L) => C = ord('A')+L-1, print(chr(C)).
It is unnecessary to sort the entire list and check if each of the parties is a majority after each evacuation. However, this inefficiency causes no problem on the large input because there are at most 26 parties.
This is the program I completed after the competition. It can solve the small input, but not the large input.
import util,sat. main => T = read_line().to_int(), foreach (TC in 1..T) [B,M] = [to_int(Token) : Token in read_line().split()], not not dc(TC,B,M) end. dc(TC,B,M) ?=> Slides = new_array(B,B), Slides :: 0..1, Ways = new_array(B), Ways :: 0..M, Order = new_array(B), Order :: 1..B, Order[1] = 1, Ways[1] = M, Ways[B] = 1, foreach(I in 1..B) Slides[I,I] = 0, Slides[B,I] = 0, foreach(J in 1..B, J !== I) #~Slides[I,J] #\/ #~Slides[J,I], Slides[I,J] #=> Order[I] #< Order[J] end, if I !== B then Ways[I] #= sum([Ways[J]*Slides[I,J] : J in 1..B, J !== I]), Order[I] #< Order[B] end end, sum([Slides[I,B] : I in 1..B-1]) #> 0, solve((Slides,Ways,Order)), printf("Case #%w: POSSIBLE\n", TC), foreach (I in 1..B) foreach(J in 1..B) printf("%w",Slides[I,J]) end, nl end. dc(TC,_B,_M) => printf("Case #%w: IMPOSSIBLE\n", TC).The array
Order
is used in order to prevent cycles in paths.
This is the program I almost completed. It would have failed to solve the small input even if I had downloaded the data. It is too-time consuming to enumerate all of the subsets when (J,P,S) = (3,3,3), which has a base set of 27 combinations.
import util. main => T = read_line().to_int(), foreach (TC in 1..T) [J,P,S,K] = [to_int(Token) : Token in read_line().split()], not not do_case(TC,J,P,S,K) end. do_case(TC,J,P,S,K) => L = [{Ij,Ip,Is} : Ij in 1..J, Ip in 1..P, Is in 1..S], N = J*P*S, member(Size, N..-1..1), subset(L,N,Size,Set), check(Set,K), printf("Case #%w: %w\n", TC,Size), foreach({Ij,Ip,Is} in Set) printf("%w %w %w\n", Ij,Ip,Is) end. table subset(L,N,Size,Set),N==Size => Set=L. subset(L,N,Size,Set),N>Size => select(E,L,L1), ( Set = [E|SetR], Size1 = Size-1 ; Set=SetR, Size1 = Size ), subset(L1,N-1,Size1,SetR). table check(Set,K) => foreach({Ij,Ip,_} in Set) len([E : E in Set, E = {Ij,Ip,_}]) =< K end, foreach({Ij,_,Ik} in Set) len([E : E in Set, E = {Ij,_,Ik}]) =< K end, foreach({_,Ip,Ik} in Set) len([E : E in Set, E = {_,Ip,Ik}]) =< K end.
I made two mistakes: I misunderstood the definition of combinations, and I tried to use dynamic programming to solve this problem. I don't think I can figure out the smart algorithm given in Contest Analysis. I should have modeled the problem as a constraint optimization problem. The following is a program I wrote after the competition:
import util, sat. main => T = read_line().to_int(), foreach (TC in 1..T) [J,P,S,K] = [to_int(Token) : Token in read_line().split()], not not do_case(TC,J,P,S,K) end. do_case(TC,J,P,S,K) => L = {{Ij,Ip,Is} : Ij in 1..J, Ip in 1..P, Is in 1..S}, N = J*P*S, Bs = new_array(N), Bs :: 0..1, sum(Bs) #=< J*P*K, % pigeon-hole principle foreach (R1 in 1..N) L[R1] = {Ij,Ip,Is}, if S > K then Bs[R1] #=> sum([Bs[R2] : R2 in 1..N, L[R2] = {Ij,Ip,_}]) #=< K end, if P > K then Bs[R1] #=> sum([Bs[R2] : R2 in 1..N, L[R2] = {Ij,_,Is}]) #=< K end, if J > K then Bs[R1] #=> sum([Bs[R2] : R2 in 1..N, L[R2] = {_,Ip,Is}]) #=< K end end, solve([$max(sum(Bs))],Bs), printf("Case #%w: %w\n", TC,sum(Bs)), foreach (R in 1..N, Bs[R]==1, L[R]={Ij,Ip,Is}) printf("%w %w %w\n", Ij,Ip,Is) end.This program is a direct translation of the problem description. It easily solves the small input, using any of the constraint modules. Using
sat
, it takes 3 minutes to solve the large input with Picat version 1.9#4 on a mobile workstation I bought one week ago. I was surprised by SAT's performance in solving optimization problems of such a scale.
I didn't pass Round 1 and was not able to move on to Round 2, but I learned a lot through the qualification round and the sub-rounds of Round 1, and I had plenty of fun. I witnessed the existence of many brilliant people like Stephon Curry in the programming world. I hope I'll participate again next year.
Many of the GCJ problems have insidious smart algorithms. I don't think I am a fast reader or a superb reasoner, and I also lack experience in competitive programming and algorithm design (I write compilers in research, and 99 percent of the algorithms I use are textbook algorithms). Picat provided me with alternative ways to solve many of the problems. My solutions take advantage of Picat's features, especially tabling and constraint solving. Almost all of the combinatorial problems have some concise and declarative models that can easily solve the small inputs. I found after the competition that some of the improved models (e.g., models for Rank and File, Technobabble, and Fashion Police) can solve the large inputs within the time limit as well.
The Picat implementation is generally as fast as the Python implementation; for programs that use tail recursion, Picat can be significantly faster than Python because Picat performs tail recursion optimization. Several contestants have used Picat in recent GCJ competitions. I believe that more and more contestants will find Picat appealing.