Tuesday, May 24, 2022

Between the Lines - Pachinko

Min Jin Lee’s Pachinko is a gripping multi-generational saga of a 20th-century Korean family. The story opens in a fishing village in Korea shortly after the Japanese annexation, then follows the family to Japan from the years before World War II to the late 1980s. I am embarrassed to admit that I knew nothing about this time period: nothing about Korea or the Korean immigrant story in Japan. Pachinko is a heartbreaking and tender story in its own right, but it is the historical world that opened up to me that made the book even more compelling.

I wouldn’t say Pachinko is perfect; the last third, which covers the later years, felt rushed. And some readers might not like that the book switches perspectives often and sometimes suddenly. Personally, I did not take issue with that, and I enjoyed the stories of the minor characters as much as those of the major ones.

Overall, I see Pachinko as a story of what it is like to be an immigrant, to be a second-class citizen, to be without a homeland, to be an outsider, all while trying to forge a home and a better life for oneself and one’s children, oftentimes at great sacrifice. The book begins, “History has failed us, but no matter.” Indeed, it seems as if history doesn’t care about these people, but in the end history does not diminish them, nor strip their lives of the complexity, richness, kindness, and the good and bad fortune that all humans experience. For Lee’s characters, who have the fortitude to carry on in the face of great adversity, the mantra really is “no matter.”

There is too much here to summarize; after all, the novel spans eight decades. But the standout theme for me concerns what it means to be an outsider and what lengths one will go to in order to fit in. In Pachinko, it isn’t just the ethnic Koreans who suffer discrimination. There are other ways to fall from grace. Lee shows compassion for and understanding of all her characters.
There are those who are shunned for possessing physical deformities, ostracized because a parent committed suicide, ones hiding homosexuality, others hiding their ethnicity, adulterers, addicts, and prostitutes. All these people struggle to be “good” and to fit in. Lee is especially skilled in her portrayal of Sunja and Kyunghee, whose quiet dignity and humanity provide the backdrop for all the stories found in this novel. At the end of the book, Sunja reflects on her life: “Beyond the dailiness, there had been moments of shimmering beauty and some glory too, even in this ajumma’s* life. Even if no one knew, it was true.” In the end, Pachinko comes full circle: history may have failed Sunja and her people, but “no matter,” because Sunja knew what her life and the life of her family meant to her, and that was enough.

*Ajumma is a Korean term for a married or middle-aged woman.

Additional Notes

The book takes its title from the popular game pachinko, a part slot machine, part pinball game of chance. Despite the game’s popularity, pachinko parlors were looked down upon as dens of gambling and crime. Ethnic Koreans in Japan, discriminated against and shut out of traditional occupations, were forced to find other ways to earn money, and pachinko parlors became one way of finding work and accumulating wealth. It is estimated that 80% of pachinko parlors in Japan today are owned by ethnic Koreans.

Pachinko was a 2017 finalist for the “National Book Award for Fiction” and was named by The New York Times as one of the 10 Best Books of 2017.

Overall

I strongly recommend you put Pachinko on your summer reading list. Yes, I know that summer reading lists are meant to be light and fun, an escape from reality, and Pachinko is tougher than that. There is pain and heartache and grim life here, but there is also hope, love, courage, and perseverance. I am also aware that summer reading lists are often books about women and largely read by women. On that score, Pachinko fits the bill. There are plenty of male characters in Pachinko, but the book opens and closes with a woman’s story: Sunja’s story, a story of grace, love, and sacrifice. It is a page-turner and at times feels like a soap opera (which I say in a non-disparaging way) as well as a well-researched and absorbing piece of historical fiction.


Wednesday, May 11, 2022

Mixtures of Hierarchical Topics with Pachinko Allocation

 

Another approach to representing the organization of topics is the pachinko allocation model (PAM) (Li & McCallum, 2006). PAM is a family of generative models in which words are generated by a directed acyclic graph (DAG) consisting of distributions over words and distributions over other nodes. A simple example of the PAM framework, four-level PAM, is described in Li and McCallum (2006). There is a single node at the top of the DAG that defines a distribution over nodes in the second level, which we refer to as super-topics. Each node in the second level defines a distribution over all nodes in the third level, or sub-topics. Each sub-topic maps to a single distribution over the vocabulary. Only the sub-topics, therefore, actually produce words. The super-topics represent clusters of topics that frequently cooccur.

In this paper, we develop a different member of the PAM family and apply it to the task of hierarchical topic modeling. This model, hierarchical PAM (hPAM), includes a multinomial distribution over the vocabulary at each internal node in the DAG. This model addresses the problems outlined above: we no longer have to commit to a single hierarchy, so getting the tree structure exactly right is not as important as in hLDA. Furthermore, “methodological” topics such as one referring to “points” and “players” can be shared between segments of the corpus.

Computer Science provides a good example of the benefits of the hPAM model. Consider three subfields of Computer Science: Natural Language Processing, Machine Learning, and Computer Vision. All three can be considered part of Artificial Intelligence. Vision and NLP both use ML extensively, but all three subfields also appear independently. In order to represent ML as a single topic in a tree-structured model, NLP and Vision must both be children of ML; otherwise, words about Machine Learning must be spread between an NLP topic, a Vision topic, and an ML-only topic.
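As a concrete illustration, the four-level generative process described above can be sketched in a few lines of Python. The dimensions and symmetric Dirichlet hyperparameters below are toy assumptions for illustration, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions): 3 super-topics, 5 sub-topics,
# a 10-word vocabulary, and a 20-word document.
n_super, n_sub, n_vocab, doc_len = 3, 5, 10, 20

# Per-document Dirichlet draws: the root node's distribution over
# super-topics, and each super-topic's distribution over sub-topics.
theta_root = rng.dirichlet(np.ones(n_super))
theta_super = rng.dirichlet(np.ones(n_sub), size=n_super)

# Corpus-level word distributions: one multinomial per sub-topic.
# Only sub-topics emit words in four-level PAM.
phi = rng.dirichlet(np.ones(n_vocab), size=n_sub)

doc = []
for _ in range(doc_len):
    s = rng.choice(n_super, p=theta_root)    # choose a super-topic
    t = rng.choice(n_sub, p=theta_super[s])  # then one of its sub-topics
    w = rng.choice(n_vocab, p=phi[t])        # the sub-topic emits a word
    doc.append(int(w))
```

hPAM modifies this process by also attaching a word multinomial to the root and super-topic nodes, so words can be emitted at internal levels as well.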
In contrast, hPAM allows higher-level topics to share lower-level topics. For this work we use a fixed number of topics, although it is possible to use nonparametric priors over the number of topics. We evaluate hPAM, hLDA and LDA based on the criteria mentioned earlier. We measure the ability of a topic model to predict unseen documents based on the empirical likelihood of held-out data given simulations drawn from the generative process of each model. We measure the ability of a model to describe the hierarchical structure of a corpus by calculating the mutual information between topics and human-generated categories such as journals. We find a 1.1% increase in empirical log likelihood for hPAM over hLDA and a five-fold increase in super-topic/journal mutual information.
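The topic/journal mutual information used in the evaluation above can be computed from a simple contingency table. Here is a minimal sketch; the function name and toy inputs are illustrative, not from the paper:

```python
import numpy as np

def mutual_information(topics, labels):
    """Empirical mutual information (in nats) between two discrete
    sequences, e.g. each document's super-topic vs. its journal."""
    topics = np.asarray(topics)
    labels = np.asarray(labels)
    n = len(topics)
    _, t_idx = np.unique(topics, return_inverse=True)
    _, l_idx = np.unique(labels, return_inverse=True)
    # Joint distribution from co-occurrence counts.
    joint = np.zeros((t_idx.max() + 1, l_idx.max() + 1))
    for i, j in zip(t_idx, l_idx):
        joint[i, j] += 1
    joint /= n
    pt = joint.sum(axis=1, keepdims=True)   # marginal over topics
    pl = joint.sum(axis=0, keepdims=True)   # marginal over labels
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pt @ pl)[nz])).sum())

# Perfectly aligned assignments give MI = log 2; independent ones give 0.
print(mutual_information([0, 0, 1, 1], ["a", "a", "b", "b"]))  # ≈ 0.693
print(mutual_information([0, 1, 0, 1], ["a", "a", "b", "b"]))  # 0.0
```

Higher mutual information indicates that the learned super-topics align more closely with the human-generated categories.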

2.2. hLDA
The hLDA model, which is described in Blei et al. (2004), represents the distribution of topics within documents by organizing the topics into a tree. Each document is generated by the topics along a single path of this tree. When learning the model from data, the sampler alternates between choosing a new path through the tree for each document and assigning each word in each document to a topic along the chosen path. In hLDA, the quality of the distribution of topic mixtures depends on the quality of the topic tree. The structure of the tree is learned along with the topics themselves using a nested Chinese restaurant process (NCRP). The NCRP prior requires two parameters: the number of levels in the tree and a parameter γ. At each node, a document sampling a path chooses either one of the existing children of that node, with probability proportional to the number of other documents assigned to that child, or a new child node, with probability proportional to γ. The value of γ can therefore be thought of as the number of “imaginary” documents in an as-yet-unsampled path.
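The NCRP child-selection step described above can be written directly. The helper name `sample_child` and the example counts are illustrative, not from the paper:

```python
import random

def sample_child(child_counts, gamma, rng=random):
    """One NCRP step at a node: pick existing child i with probability
    proportional to child_counts[i] (documents already assigned to it),
    or a brand-new child with probability proportional to gamma.
    Returns an index into child_counts, or len(child_counts) for a new child.
    """
    total = sum(child_counts) + gamma
    r = rng.uniform(0, total)
    for i, count in enumerate(child_counts):
        if r < count:
            return i
        r -= count
    return len(child_counts)  # fell into the gamma mass: create a new child

# At a node whose two children hold 5 and 3 documents, with gamma = 1,
# the probabilities are 5/9, 3/9, and 1/9 (new child) respectively.
choice = sample_child([5, 3], 1.0)
```

A document repeats this step at each level down the tree, which is how popular paths accumulate documents while γ controls how readily new branches appear.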

2.3. PAM
Pachinko allocation models documents as a mixture of distributions over a single set of topics, using a directed acyclic graph to represent topic cooccurrences. Each node in the graph is a Dirichlet distribution. At the top level there is a single node. Except at the bottom level, each node represents a distribution over nodes in the next lower level. The distributions at the bottom level represent distributions over words in the vocabulary. In the simplest version, there is a single layer of Dirichlet distributions between the root node and the word distributions at the bottom level. These nodes can be thought of as “templates” for common cooccurrence patterns among topics. PAM does not represent word distributions as parents of other distributions, but it does exhibit several hierarchy-related phenomena. Specifically, trained PAM models often include what appears to be a “background” topic: a single topic with high probability in all super-topic nodes. Earlier work with four-level PAM suggests that it reaches its optimal performance at numbers of topics much larger than previously published topic models (Li & McCallum, 2006).
