% ------------------------------------------------------------------------
\documentclass{llncs}
\input{prelude}
\begin{document}
\title{$\mbox{\EightStarBold}$ Discrepancies for generalized Halton points\\
Comparison of three heuristics for generating point sets}
% \titlerunning{} % abbreviated title (for running head)
% also used for the TOC unless
% \toctitle is used
\author{Thomas Espitau$\mbox{}^{\mbox{\SnowflakeChevron}}$ \and Olivier Marty$\mbox{}^{\mbox{\SnowflakeChevron}}$}
%
% \authorrunning{} % abbreviated author list (for running head)
%
%%%% list of authors for the TOC (use if author list has to be modified)
% \tocauthor{}
%
\institute{$\mbox{}^{\mbox{\SnowflakeChevron}}$ ENS Cachan}
\maketitle
\makeatletter
\renewcommand\bibsection%
{
\section*{\refname
\@mkboth{\MakeUppercase{\refname}}{\MakeUppercase{\refname}}}
}
\makeatother
\begin{abstract}
Geometric discrepancies are standard measures to quantify the irregularity of
distributions. They are an important notion in numerical integration.
One of the most important discrepancy notions is the so-called star
discrepancy. Roughly speaking, a point set of low star discrepancy value
allows for a small approximation error in quasi-Monte Carlo integration.
In this work we present a tool realizing the implementation of three
basic heuristics for the construction of low-discrepancy point sets
in the generalized Halton model: fully random search, local search with
simulated annealing, and genetic $(5+5)$ search with an ad-hoc
crossover function.
\end{abstract}
\section{General architecture of the tool}
The testing tool is designed to be modular: it is made of independent blocks that
are interfaced through a scheduler. More precisely, a master wrapper written
in Python calls a first layer which performs the chosen heuristic. This
layer is written in C++ for performance. The given discrepancy algorithm
--- written in C --- is called whenever the evaluation of a state is needed.
The wrapper dispatches the computations on the multi-core architecture of
modern computers\footnote{for us, between 2 and 4 physical cores and 4 or 8
virtual cores}. This basic architecture is described in figure~\ref{main_flow}.
More precisely, the class diagram for the unitary test layer is
presented in figure~\ref{class_flow}.
Experiments were conducted on two machines:
\begin{itemize}
\item 2.4 GHz Intel Dual Core i5 hyper-threaded to 2.8 GHz, 8 GB 1600 MHz DDR3.
\item 2.7 GHz Intel Quad Core i7 4800MQ hyper-threaded to 3.7 GHz, 16 GB 1600 MHz DDR3.
\end{itemize}
\begin{figure}
\includegraphics[scale=0.6]{main_flow.pdf}
\caption{Tool overview}
\label{main_flow}
\end{figure}
\begin{figure}
\includegraphics[scale=1]{graph_classes.pdf}
\caption{Class dependencies}
\label{class_flow}
\end{figure}
On these machines, some basic profiling made it clear that
the main bottleneck of the computations is the \emph{computation
of the discrepancy}. The chosen algorithm for this
cost function is the DEM-algorithm~\cite{Dobkin}, in the implementation of
\emph{Magnus Wahlstr\"om}~\cite{Magnus}.\medskip
All the experiments have been conducted in dimensions 2, 3, and 4
--- with the Halton prime bases fixed to 7, 13, 29, and 3. Some minor tests have
been made in order to discuss the dependency of the discrepancy and of the
efficiency of the heuristics on the values chosen for the
prime bases. The average results remain roughly identical when
changing these primes and taking them in the range [2, 100]. For this
reason we decided to pursue the full computations with a fixed
basis.
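For reference, in this model the $k$-th coordinate of the $n$-th generalized
Halton point is obtained by writing $n$ in the $k$-th prime base and
scrambling its digits with the permutation attached to that dimension. The
following C++ sketch of this construction is only meant to fix ideas; it
assumes the permutations map 0 to 0 (so that trailing zero digits do not
contribute) and is not the tool's actual implementation.
\begin{verbatim}
#include <cstddef>
#include <cstdint>
#include <vector>

// One coordinate of a generalized Halton point: the radical inverse of n in
// base p, with the digits scrambled by the permutation sigma (sigma[0] == 0
// is assumed, so that trailing zero digits do not contribute).
double scrambled_radical_inverse(std::uint64_t n, unsigned p,
                                 const std::vector<unsigned>& sigma) {
  double value = 0.0;
  const double inv_base = 1.0 / p;
  double factor = inv_base;
  while (n > 0) {
    value += sigma[n % p] * factor;  // scrambled digit, weighted by p^-(i+1)
    n /= p;
    factor *= inv_base;
  }
  return value;
}

// The n-th point of the d-dimensional set: one permutation per dimension.
std::vector<double> generalized_halton_point(
    std::uint64_t n, const std::vector<unsigned>& primes,
    const std::vector<std::vector<unsigned>>& sigmas) {
  std::vector<double> point(primes.size());
  for (std::size_t k = 0; k < primes.size(); ++k)
    point[k] = scrambled_radical_inverse(n, primes[k], sigmas[k]);
  return point;
}
\end{verbatim}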
\subsection{Algorithmic insights}
To perform an experiment, we wrapped the main algorithm in a
loop which calls the chosen heuristic multiple
times in order to smooth the results and obtain more exploitable data.
The arithmetic mean of the resulting values is then computed. In addition,
extremal values are also recorded in order to construct error-band graphs.
Graphs are presented not with the usual box plots to show the
error bounds, but in a more graphical way with error bands: the graph
of the mean result is included inside a band of the same color which
represents the uncertainty of the values obtained.
The conduct of one experiment is described in the
flowchart~\ref{insight_flow}. The number of iterations of the heuristic is
$I$ and the number of full restarts is $N$. The function Heuristic() corresponds to
a single step of the chosen heuristic.
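The following sketch summarizes this outer loop. The names $N$ and $I$ mirror
the flowchart, the parameter \texttt{run\_once} is a hypothetical stand-in
for one full run of the chosen heuristic ($I$ steps, returning the best
discrepancy found), and the actual wrapper performs the same aggregation with
more bookkeeping.
\begin{verbatim}
#include <algorithm>
#include <functional>
#include <vector>

struct ExperimentSummary {
  double mean, best, worst;  // mean and extremal values for the error bands
};

// N independent restarts of I iterations each; assumes N > 0.
ExperimentSummary run_experiment(int N, int I,
                                 const std::function<double(int)>& run_once) {
  std::vector<double> results;
  results.reserve(N);
  for (int restart = 0; restart < N; ++restart)
    results.push_back(run_once(I));
  double sum = 0.0;
  for (double r : results) sum += r;
  return {sum / N,
          *std::min_element(results.begin(), results.end()),
          *std::max_element(results.begin(), results.end())};
}
\end{verbatim}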
We now present an in-depth view of
the implemented heuristics.
\begin{figure}
\begin{mdframed}
\includegraphics[scale=0.4]{insight.pdf}
\caption{Flowchart of a single experiment}
\label{insight_flow}
\end{mdframed}
\end{figure}
\section{Heuristics developed}
\subsection{Fully random search (Test case)}
The first heuristic implemented is the random search. We generate
random permutations, compute the corresponding sets of Halton points
and select the best set with regards to its discrepancy.
The process is wrapped up in the
flowchart~\ref{random_flow}. In order to generate random permutations
at each step, we transform them directly from the previous ones.
More precisely, the permutation is a singleton object which has a method
random, built on the Knuth-Fisher-Yates shuffle. This algorithm allows
us to generate a uniformly chosen permutation at each step. We recall
this fact and detail the algorithm in the following section.
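Before detailing the shuffle, here is a minimal sketch of the selection loop
itself. The helpers \texttt{build\_halton\_set} and \texttt{discrepancy} are
hypothetical stand-ins for the point-set construction and the DEM code, and
the tool's actual permutation class is not shown.
\begin{verbatim}
#include <algorithm>
#include <random>
#include <vector>

// Hypothetical helpers assumed to be provided elsewhere by the tool.
std::vector<std::vector<double>> build_halton_set(
    const std::vector<std::vector<unsigned>>& sigmas, int n_points);
double discrepancy(const std::vector<std::vector<double>>& points);

// Fully random search: draw fresh permutations at each step, keep the best.
std::vector<std::vector<unsigned>> random_search(
    std::vector<std::vector<unsigned>> sigmas,  // one permutation per dimension
    int n_points, int iterations, std::mt19937& rng) {
  auto best = sigmas;
  double best_value = discrepancy(build_halton_set(best, n_points));
  for (int it = 0; it < iterations; ++it) {
    for (auto& sigma : sigmas)                        // re-shuffle each dimension
      std::shuffle(sigma.begin(), sigma.end(), rng);  // Fisher-Yates shuffle
    double value = discrepancy(build_halton_set(sigmas, n_points));
    if (value < best_value) { best_value = value; best = sigmas; }
  }
  return best;
}
\end{verbatim}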
\begin{figure}
\begin{mdframed}
\includegraphics[scale=0.4]{flow_rand.pdf}
\caption{Flowchart of the random search}
\label{random_flow}
\end{mdframed}
\end{figure}
\subsubsection{The Knuth-Fisher-Yates shuffle}
The Fisher–Yates shuffle is an algorithm for generating a random permutation
of a finite set. The Fisher–Yates shuffle is unbiased, so that every
permutation is equally likely. We present here the Durstenfeld variant of
the algorithm, presented by Knuth in \emph{The Art of Computer Programming}
vol. 2~\cite{Knuth}.
The algorithm's time complexity is $O(n)$, compared to $O(n^2)$ for
the naive implementation.
\begin{algorithm}[H]
\SetAlgoLined
\SetKwFunction{Rand}{Rand}
\SetKwFunction{Swap}{Swap}
\KwData{A table T[1..n]}
\KwResult{Same table T, shuffled}
\For{$i\leftarrow 1$ \KwTo $n-1$}{
$j \leftarrow$ \Rand{$[0,n-i]$}\;
\Swap{$T[i], T[i+j]$}\;
}
\caption{KFY algorithm}
\end{algorithm}
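For completeness, a minimal C++ version of this shuffle, using the standard
library generator (the tool's actual permutation class is not shown), could
look as follows.
\begin{verbatim}
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Knuth-Fisher-Yates shuffle: at step i, T[i] is swapped with a uniformly
// chosen element among T[i..n-1] (0-based), so that every permutation of T
// is produced with probability 1/n!.
void kfy_shuffle(std::vector<unsigned>& T, std::mt19937& rng) {
  for (std::size_t i = 0; i + 1 < T.size(); ++i) {
    std::uniform_int_distribution<std::size_t> dist(i, T.size() - 1);
    std::swap(T[i], T[dist(rng)]);
  }
}
\end{verbatim}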
\begin{lemma}
The resulting permutation of KFY is unbiased.
\end{lemma}
\begin{proof}
Let us consider the set $\{1,\ldots, n\}$ as the vertices of a random graph
constructed as the trace of the execution of the algorithm:
an edge $(i,j)$ exists in the graph if and only if the swap of $T[i]$ and
$T[j]$ has been executed. This graph encodes the permutation represented by
$T$. To be able to encode any permutation the considered graph must be
connected --- in order to allow any pair of points to be swapped.
Since by construction every vertex is touched by an edge, and since there
exist exactly $n-1$ edges, we can conclude directly that any permutation can
be reached by the algorithm. Since the probability of getting a fixed graph
of $n-1$ edges with every vertex of degree at least one is $n!^{-1}$, the
algorithm is thus unbiased.
\end{proof}
\subsubsection{Results and stability}
We first want to analyze the dependence of the results on the number of
iterations of the heuristic, in order to discuss its stability.
The results are compiled in figures~\ref{rand_iter2} and~\ref{rand_iter3},
restricted to a number of points between 80 and 180.
We emphasize that a lot of data appears on these graphs,
and the error-band representation makes them a bit messy. These graphs
were made for extensive internal experiments and parameter searches.
The final wrap-up graphs are much lighter and only present the best
results obtained.
As expected from a fully random search, error bands are very large for
a low number of iterations ($15\%$ of the value for 400 iterations) and tend
to shrink with a bigger number of iterations (around $5\%$ for 1500 iterations).
This shrinkage is a direct consequence of well-known concentration bounds
(Chernoff and Azuma-Hoeffding).
The average results are quite stable; they decrease progressively with
the growing number of iterations, but seem to reach a limit after 1000
iterations. This value acts as a threshold for the interesting number of iterations.
As such, meaningful experiments can be conducted with \emph{only} 1000 iterations,
without degrading too much the quality of the set with regards to its
discrepancy under this heuristic.
\begin{figure}
\includegraphics[scale=0.3]{Results/random_iter.png}
\caption{Dependence on iterations, dimension 2}
\label{rand_iter2}
\end{figure}
\begin{figure}
\includegraphics[scale=0.3]{Results/random_iter_3.png}
\caption{Dependence on iterations, dimension 3}
\label{rand_iter3}
\end{figure}
\subsection{Local search with simulated annealing}
The second heuristic implemented is a randomized local search with
simulated annealing. This heuristic is inspired by the physical
process of annealing in metallurgy.
Simulated annealing interprets the physical slow cooling as a
slow decrease in the probability of accepting worse solutions as it
explores the solution space.
More precisely, a state is a $d$-tuple of permutations, one for each dimension,
and the neighborhood is the set of $d$-tuples of permutations which can be obtained
by applying exactly one transposition to one of the permutations of
the current state.
The selection phase depends on the current temperature:
after selecting randomly a state in the neighborhood, either
the discrepancy of the corresponding Halton set has decreased and the
move is kept, or it has not but the move is still kept with
probability $e^{\frac{\delta}{T}}$, where $\delta$ is the difference
between the old and new discrepancy, and $T$ the current temperature.
If the discrepancy has decreased, the temperature $T$ is multiplied
by a factor $\lambda$ (fixed to $0.992$ in all our simulations), hence
decreased.
The whole algorithm is described in the flowchart~\ref{flow_rec}.
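A minimal sketch of a single annealing step, matching the acceptance rule
above, is given below. The helpers \texttt{random\_transposition\_neighbor}
and \texttt{evaluate\_discrepancy} are hypothetical stand-ins for the
corresponding parts of the tool.
\begin{verbatim}
#include <cmath>
#include <random>
#include <vector>

using State = std::vector<std::vector<unsigned>>;  // d permutations

// Hypothetical helpers assumed to be provided elsewhere by the tool.
State random_transposition_neighbor(const State& s, std::mt19937& rng);
double evaluate_discrepancy(const State& s, int n_points);

// One step of the local search: improvements are always kept and cool the
// temperature by lambda; degradations are kept with probability
// exp(delta / T), where delta = old - new discrepancy (negative here).
void annealing_step(State& current, double& current_value, double& T,
                    double lambda, int n_points, std::mt19937& rng) {
  State candidate = random_transposition_neighbor(current, rng);
  double value = evaluate_discrepancy(candidate, n_points);
  double delta = current_value - value;
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  if (delta > 0) {                                  // better: accept, cool down
    current = candidate; current_value = value; T *= lambda;
  } else if (unif(rng) < std::exp(delta / T)) {     // worse: sometimes accept
    current = candidate; current_value = value;
  }
}
\end{verbatim}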
\begin{figure}
\begin{mdframed}
\includegraphics[scale=0.4]{flow_recuit.pdf}
\caption{Flowchart of the local search with simulated annealing heuristic}
\label{flow_rec}
\end{mdframed}
\end{figure}
\subsubsection{Dependence on the temperature}
First experiments were made to select the best initial temperature.
Results are compiled in graphs~\ref{temp_2},~\ref{temp3}, and~\ref{temp3_z}.
Graphs~\ref{temp_2} and~\ref{temp3} represent the results obtained respectively
in dimension 2 and 3 between 10 and 500 points. The curve obtained is
characteristic of the average evolution of the discrepancy optimization
algorithms for Halton point sets: a very fast decrease for low numbers of
points --- roughly up to 80 points --- and then a very slow one
after~\cite{Doerr}.
The most interesting part of these results is concentrated between 80 and 160
points, where the different curves split. The graph~\ref{temp3_z} is a zoom
of~\ref{temp3} in this window. We remark on that graph that the lower the
temperature is, the better the results are, with a threshold at $10^{-3}$.
\begin{figure}
\includegraphics[scale=0.3]{Results/resu_2_temp.png}
\caption{Dependence on initial temperature: D=2}
\label{temp_2}
\end{figure}
\begin{figure}
\includegraphics[scale=0.3]{Results/resu_temp3.png}
\caption{Dependence on initial temperature: D=3}
\label{temp3}
\end{figure}
\begin{figure}
\includegraphics[scale=0.3]{Results/resu_temp3_zoom.png}
\caption{Dependence on initial temperature (zoom): D=3}
\label{temp3_z}
\end{figure}
\subsubsection{Stability with regards to the number of iterations}
As for the fully random search heuristic, we investigated the stability
of the algorithm with regards to the number of iterations. We present
the result in dimension 3 in the graph~\ref{iter_sa}. Once again we
restricted the window to between 80 and 180 points, where the curves split.
An interesting phenomenon can be observed: the error rates are somehow
invariant w.r.t.\ the number of iterations, and once again the 1000-iteration
threshold seems to appear --- point 145 shows a light split between iteration
1600 and the others, but except for that point, running more than 1000
iterations tends to be a waste of time. The error rate is biggest for 80 points,
at about $15\%$ of the value, which is similar to the error
rates for fully random search with 400 iterations.
\begin{figure}
\includegraphics[scale=0.3]{Results/sa_iter.png}
\caption{Dependence on the number of iterations for simulated annealing: D=3}
\label{iter_sa}
\end{figure}
\subsection{Genetic $(\mu+\lambda)$ search}
The third heuristic implemented is the $(\mu+\lambda)$ genetic search.
This heuristic is inspired by the evolution of species: a family
of $\mu$ genes is known (they are generated randomly at the beginning),
from which $\lambda$ new genes are derived. A gene is the set of parameters
we are optimizing, i.e.\ the permutations.
Each new gene is derived either from one gene by applying a mutation
(here a transposition in one of the permutations), or from two
genes by applying a crossover: a blending of both genes (the
algorithm is described in detail further on). The probability of
making a crossover rather than a mutation is $c$, the third parameter of the algorithm,
alongside $\mu$ and $\lambda$. After that, only the $\mu$ best
genes are kept, according to their fitness, and the evolutionary process
can start again.
Because varying $\mu$ or $\lambda$ does not fundamentally change
the algorithm, we chose to fix $\mu=\lambda=5$ once and for all,
which seemed to be a good trade-off between the running time of
each iteration and the size of the family.
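A minimal sketch of one generation of this $(\mu+\lambda)$ scheme is given
below. The operators \texttt{mutate}, \texttt{crossover}, and
\texttt{fitness} are hypothetical stand-ins for the transposition, the
crossover described in the next section, and the discrepancy evaluation.
\begin{verbatim}
#include <algorithm>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

using Gene = std::vector<std::vector<unsigned>>;  // one permutation per dimension

// Hypothetical operators assumed to be provided elsewhere by the tool.
Gene mutate(const Gene& g, std::mt19937& rng);               // one transposition
Gene crossover(const Gene& a, const Gene& b, std::mt19937& rng);
double fitness(const Gene& g);       // discrepancy of the induced Halton set

// One generation: derive lambda offspring (crossover with probability c,
// mutation otherwise), then keep the mu best genes among parents and
// offspring (lower discrepancy is better).
void next_generation(std::vector<Gene>& family, std::size_t mu,
                     std::size_t lambda, double c, std::mt19937& rng) {
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  std::uniform_int_distribution<std::size_t> pick(0, family.size() - 1);
  std::vector<Gene> pool = family;
  for (std::size_t i = 0; i < lambda; ++i) {
    if (unif(rng) < c)
      pool.push_back(crossover(family[pick(rng)], family[pick(rng)], rng));
    else
      pool.push_back(mutate(family[pick(rng)], rng));
  }
  std::vector<std::pair<double, Gene>> scored;
  for (auto& g : pool) scored.emplace_back(fitness(g), std::move(g));
  std::sort(scored.begin(), scored.end(),
            [](const auto& x, const auto& y) { return x.first < y.first; });
  family.clear();
  for (std::size_t i = 0; i < mu && i < scored.size(); ++i)
    family.push_back(std::move(scored[i].second));
}
\end{verbatim}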
\subsubsection{Crossover algorithm}
We designed an ad-hoc crossover for permutations. The idea is simple: given two
permutations $A$ and $B$ of $\{1..n\}$, it constructs a new permutation
$C$ value after value, in a random order (we use our permutation class
for this). For each index $i$, we take either $A_i$ or $B_i$. If exactly
one of those values is available (i.e.\ it was not already chosen),
we choose it. If both are available, we choose randomly and we remember
the second. If both are unavailable, we choose a remembered value. The
flowchart of this algorithm is described in figure~\ref{cross_flow}.
\begin{figure}
\begin{mdframed}
\includegraphics[scale=0.4]{crossover_flow.pdf}
\caption{Flowchart of the crossover algorithm.}
\label{cross_flow}
\end{mdframed}
\end{figure}
The benefits of this method are that it keeps common values of $A$ and $B$,
the values $C_i$ are often among $\{A_i, B_i\}$ (hence $C$ is close to $A$
and $B$), and it does not favor either the beginning or the ending of the
permutations.
\begin{algorithm}[H]
\SetAlgoLined
\SetKwFunction{Rand}{Rand}
\SetKwFunction{Swap}{Swap}
\SetKwFunction{RPO}{a random permutation of }
\SetKwFunction{RVI}{a random value in }
\KwData{Two permutations A[1..n], B[1..n]}
\KwResult{A permutation C[1..n]}
$\pi \leftarrow \RPO \{1,\dots,n\}$\;
$available \leftarrow \{\}$\;
$got \leftarrow \{\}$\;
\For{$i\leftarrow 1$ \KwTo $n$}{
$j \leftarrow \pi_i$\;
$a \leftarrow A_j$\;
$b \leftarrow B_j$\;
\If{$a \in got \land b \in got$}{
$v \leftarrow \RVI available$\;
}
\ElseIf{$a \in got$}{
$v \leftarrow b$\;
}
\ElseIf{$b \in got$}{
$v \leftarrow a$\;
}
\Else{
$\Swap (A, B)$ with probability $1/2$\;
$v \leftarrow A_j$\;
$available \leftarrow available \cup \{B_j\}$\;
}
$C_j \leftarrow v$\;
$got \leftarrow got \cup \{v\}$\;
$available \leftarrow available \setminus \{v\}$\;
}
\caption{Permutations crossover}
\end{algorithm}
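As a complement to the pseudocode, the following C++ sketch concretizes the
set operations (in particular the choice of a random available value); it is
a direct transliteration, not the tool's actual implementation, and uses
0-based values and positions.
\begin{verbatim}
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Ad-hoc crossover of two permutations A and B of {0,...,n-1}: positions are
// visited in a random order; C takes A's or B's value when possible, and
// falls back on a value set aside earlier otherwise.
std::vector<unsigned> crossover(std::vector<unsigned> A, std::vector<unsigned> B,
                                std::mt19937& rng) {
  const std::size_t n = A.size();
  std::vector<unsigned> C(n);
  std::vector<char> got(n, 0);      // got[v] == 1 iff v is already placed in C
  std::vector<unsigned> available;  // values set aside at "both free" steps
  std::vector<std::size_t> order(n);
  std::iota(order.begin(), order.end(), 0);
  std::shuffle(order.begin(), order.end(), rng);  // random visiting order
  std::uniform_int_distribution<int> coin(0, 1);

  for (std::size_t j : order) {
    unsigned a = A[j], b = B[j], v;
    if (got[a] && got[b]) {         // both taken: pick a set-aside value
      std::uniform_int_distribution<std::size_t> pick(0, available.size() - 1);
      v = available[pick(rng)];     // available is non-empty in this case
    } else if (got[a]) {
      v = b;
    } else if (got[b]) {
      v = a;
    } else {                        // both free: keep one, set the other aside
      if (coin(rng)) std::swap(a, b);
      v = a;
      if (std::find(available.begin(), available.end(), b) == available.end())
        available.push_back(b);     // set union: no duplicates
    }
    C[j] = v;
    got[v] = 1;
    available.erase(std::remove(available.begin(), available.end(), v),
                    available.end());  // v is no longer available
  }
  return C;
}
\end{verbatim}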
\subsubsection{Dependence on the parameter $c$}
First experiments were made to select the value for the crossover parameter
$c$. Results are compiled in graphs~\ref{res_gen2},~\ref{res_gen2z},~\ref{res_gen3},
and~\ref{res_gen4}.
Graph~\ref{res_gen2} represents the results obtained
in dimension 2 between 10 and 500 points. The curve obtained is, with no
surprise again,
the characteristic curve of the average evolution of the discrepancy we already
saw in the previous experiments.
The most interesting part of these results is concentrated --- once again ---
between 80 and 160 points, where the different curves split.
The graph~\ref{res_gen2z} is a zoom of~\ref{res_gen2} in this window, and
graphs~\ref{res_gen3} and~\ref{res_gen4} are directly focused on it as well.
We remark that in dimension 2 the results are better for $c$ close to $0.5$,
whereas for dimensions 3 and 4 the best results are obtained for $c$ closer to
$0.1$, that is, a low probability of making a crossover.
\begin{figure}
\includegraphics[scale=0.3]{Results/res_gen_2.png}
\caption{Dependence on parameter $c$: D=2}
\label{res_gen2}
\end{figure}
\begin{figure}
\includegraphics[scale=0.3]{Results/res_gen_2_zoom.png}
\caption{Dependence on parameter $c$ (zoom): D=2}
\label{res_gen2z}
\end{figure}
\begin{figure}
\includegraphics[scale=0.3]{Results/res_gen_3_zoom.png}
\caption{Dependence on parameter $c$: D=3}
\label{res_gen3}
\end{figure}
\begin{figure}
\includegraphics[scale=0.3]{Results/res_gen_4_zoom.png}
\caption{Dependence on parameter $c$: D=4}
\label{res_gen4}
\end{figure}
\subsubsection{Stability}
Once again we investigated the stability
of the algorithm with regards to the number of iterations, and we
restricted the window to between 80 and 180 points, where the curves split.
The results are compiled in graph~\ref{gen_iter}.
An interesting phenomenon can be observed: the error rates get really
big for 1400 iterations at low numbers of points (up to 120),
even though the average results are stable after the 1000-iteration threshold,
as observed before.
\begin{figure}
\includegraphics[scale=0.3]{Results/gen_iter.png}
\caption{Stability w.r.t.\ number of iterations: D=2}
\label{gen_iter}
\end{figure}
\section{Results and conclusions}
Finally, we made extensive experiments to compare the three previously
presented heuristics. The parameters chosen for the heuristics have been
selected based on the experiments conducted in the previous sections.
Results are compiled in the last
figures~\ref{wrap2},~\ref{wrap2z},~\ref{wrap3z}, and~\ref{wrap4z}. The
characteristic curve of decrease
of the discrepancy is still clearly recognizable in graph~\ref{wrap2},
made for points ranging between 10 and 600. We then present the results
in the --- now classic --- window of 80 to 180 points.
For all dimensions, the superiority of the non-trivial algorithms --- simulated
annealing and genetic search --- over fully random search is clear.
Both curves for these heuristics are way below the error band of random
search. As a result, \emph{the worst average results of the non-trivial heuristics
are better than the best average results when sampling points at random}.
In dimension 2 (figure~\ref{wrap2z}), the best results are given by simulated
annealing, whereas in dimensions 3 and 4 (figures~\ref{wrap3z} and~\ref{wrap4z}),
the best results are given by the genetic search. It is also noticeable that in that range
of points the error rates are roughly the same for all heuristics:
\emph{for 1000 iterations, the stability of the results is globally the
same for each heuristic}.
\begin{figure}
\includegraphics[scale=0.3]{Results/wrap_2.png}
\caption{Comparison of all heuristics: D=2}
\label{wrap2}
\end{figure}
\begin{figure}
\includegraphics[scale=0.3]{Results/wrap_2_zoom.png}
\caption{Comparison of all heuristics (zoom): D=2}
\label{wrap2z}
\end{figure}
\begin{figure}
\includegraphics[scale=0.3]{Results/wrap_3.png}
\caption{Comparison of all heuristics: D=3}
\label{wrap3z}
\end{figure}
\begin{figure}
\includegraphics[scale=0.3]{Results/wrap_4.png}
\caption{Comparison of all heuristics: D=4}
\label{wrap4z}
\end{figure}
\section*{Acknowledgments}
We would like to thank Magnus Wahlstr\"om from the Max Planck Institute for Informatics
for providing an implementation of the DEM algorithm.
We would also like to thank Christoph D\"urr and Carola Doerr
for several very helpful talks on the topic of this work.
Both Thomas Espitau and Olivier Marty are supported by the French Ministry for
Research and Higher Education, through the \'Ecole Normale Sup\'erieure.
\bibliographystyle{alpha}
\bibliography{bi}
\end{document}
  433. \end{document}