I regularly quote these rules - online and offline - and over the years have recognized an important addendum to rule 4:
4b. Tuition varies
In Dr. Cherie's book, each rule is elaborated in a separate chapter that begins with an introduction and is followed by a few sections presenting themes (or perhaps lesson plans) relating to the rule. For Rule 3 - There are no mistakes, only lessons - the themes are compassion, forgiveness, ethics and humor. Here is an excerpt from the introduction for this chapter:
Rather than viewing your own mistakes as failures and others' mistakes as slights, you can view them as opportunities to learn. As Emerson said, "Every calamity is a spur and a valuable hint." Every situation in which you do not live up to your own expectations is an opportunity to learn something about your own thoughts and behaviors. Every situation in which you feel "wronged" by another person is a chance to learn something about your reactions. Whether it is your own wrongdoing or someone else's, a mistake is simply an opportunity to evolve further along your spiritual path.
The chapter on Rule 4 - Lessons are repeated as needed - includes elaborations on awareness, willingness, causality and patience. It begins with an introduction that articulates the following insight:
You will continually attract the same lessons into your life. You will also draw to you teachers to teach you that lesson until you get it right. The only way you can free yourself of difficult patterns and issues you tend to repeat is by shifting your perspective so that you can recognize the patterns and learn the lessons that they offer. You may try to avoid the situations, but they will eventually catch up with you.
I regularly experience the wisdom of these rules, with varying degrees of awareness. It often seems to be the case that I recognize their relevance retroactively. Over time, I have also learned that the lessons have variable costs, which exhibit a general upward trend. The costs can take different forms - money, opportunities, friends - but the most common currency appears to be emotional pain. In a 12-step group I used to attend, we often light-heartedly referred to the repetition of an emotionally painful lesson as another f***ing growth opportunity, or AFGO for short.
Emotional pain experienced during lessons is often magnified by my unwillingness to practice self-forgiveness - for which two specific lesson plans are highlighted in the chapters on rules 3 and 4 - and my tendency to add a layer of self-criticism when I recognize that I've unwittingly and unintentionally repeated a lesson yet again. But when the lessons involve other people - in either a cooperative or adversarial learning opportunity - the emotional pain also arises from my guilt over the tuition they pay ... fees that, in many cases, my co-learners never explicitly signed up for.
The highest student debt accumulated for my repeated lessons involves my wife and children. The costs in terms of the emotional pain I experience when I fail to live up to my own expectations of being an attentive and effective husband and father are enormous, and greatly intensify each time lessons are repeated. And, unfortunately, the costs borne by them through such lessons can also be considerable.
Marriage vows traditionally take into account tuition costs for partners' lessons, or at least that's my interpretation of "for better, for worse, for richer, for poorer, in sickness and in health". Marriage, at least in our culture, involves an explicit choice by both partners, and each has the option of dropping the course(s). My wife has been one of my best, most consistent and often unappreciated teachers ... though I like to think at least some of her tuition has been reciprocated through lessons I may have offered her over the years.
Children are typically not offered an explicit opportunity to choose whether to sign up for lessons they learn with or from their parents. My children are also among my best teachers, as they regularly help me recognize hypocrisy or duplicity when I preach one principle but practice another, or when I am otherwise not living up to my - or their - expectations of being a good father. I don't know quite how to account for their tuition costs over the years, but I like to believe the process of microevolution will ultimately allow them to reap some valuable rewards from the dues they've been paying.
I attended my first David Whyte workshop last month, held at the First Covenant Church in Seattle. The theme was The Harvest of Winter, exploring the challenges posed by difficult harvests and the opportunities they provide for asking beautiful questions: disturbing, provocative questions whose answers can unlock deep, hidden insights. These questions become increasingly important during increasingly difficult times, when many of our traditional sources of power and security are undergoing transition.
Throughout the workshop, Whyte recited - mostly from memory - poems written by him and other poets, told stories about the contexts in which the poems originated, and highlighted the beautiful questions articulated through the poems, implicitly or explicitly. The content seemed loosely organized around a three-step process one might follow when attempting to harvest from a difficult source of nourishment and sustenance, which I will briefly enumerate here and then elaborate further below.
Turn to a different source; i.e., stop having the conversation you're having now, to create an "invitational absence" in which a new conversation can emerge.
Re-evaluate the stories you've been telling about yourself, and discard the stories that are not - or are no longer - true.
In the search for a new source, try combining different parts of your self that have never been in conversation before.
The old Latin root of that word is Conversatia and it really means a kind of "living with" or "in companionship with," so you’re having a Conversatia with your spouse or your partner at home every day. There’s a "living with" whether it’s spoken out loud or not. There is an equal kind of conversation with silence, and with a particular way that you as an individual ask the question of life. You’ve got to find that contact point as an individual. Ask the question, "Where am I interested? Where, in a very short time, do I become passionate once I’ve opened up that initial interest? What do I have energy for? And will I have faith enough to actually spend enough time that I can open up that door into what to begin with is a new territory but eventually becomes my new home?"
One of the interesting stories he shared in illustrating this conversation about passions was his experience in applying for a job as a naturalist guide aboard a schooner in the Galapagos Islands (a story I first encountered in his book, The Heart Aroused). Among the 500 applicants, 90 were invited for an on-site interview for the position, and what set him apart - and made him an ideal candidate for his ideal job - was not so much his degree in marine biology, but all the extracurricular interests he had pursued during his studies: scuba diving, rock climbing and vagabonding around Europe (and thereby picking up some foreign languages). The experience led him to adopt an abiding faith in his intrinsic interests and passions ... and subsequently to encourage others to similarly have faith in what draws them.
I've long been intrigued by the stories we make up about ourselves. Whyte characterized such stories as hermetic enclosures - bubbles that we create so we don't have to engage in wider conversations. One of the stories he once made up about himself was a need for quiet and solitude in order for him to write. Letting go of that story opened up new frontiers - internally and externally - for his subsequent writing.
Whyte also made a non-specific reference to some story involving his mother that he learned was not true after her death. He did not share any details about this story, but I wonder if it was a story he wrote about in Crossing the Unknown Sea, in which his mother told him about a dream in which she interceded to save him during a near-death experience in the Galapagos, including details of the experience that he had never told her about. Regardless of the truth of that particular story, the revelation it evoked for him seems like a timeless truth:
But irrespective of the far-fetched psychic reality of it all, something else had happened inside me. I stopped trying to do it all myself. I was like everything else in this life. I didn't need to have absolute total control over my destiny. I couldn't have it anyway. ... I was given a sense of the intimate way everything is a brother and sister to everything else. Everything we see as private is somehow already out in the world. The singularity of existence is only half the story; all our singularities are in the conscious and unconscious conversation with everything else.
Whyte often uses the word marriage to describe the intimate relationships we have with our selves, other people and even our work, a theme most extensively elaborated in his book, The Three Marriages: Reimagining Work, Self and Relationship. During the workshop, he observed that there is no worthwhile path that does not risk leaving us heartbroken, and that even the longest, most successful marriages include periods of heartbreak. This notion of periodic heartbreak is a theme he writes about in one of his periodic "Letters from the House", this one on The Poetic Narratives of Our Time:
If we are sincere, every good marriage or relationship will break our hearts in order to enlarge our understanding of our self and that strange other with whom we have promised ourselves to the future. Being a good parent will necessarily break our hearts as we watch a child grow and eventually choose their own way, even through many of the same heartbreaks we have traversed. Following a vocation or an art form through decades of practice and understanding will break the idealistic heart that began the journey and replace it, if we sidestep the temptations of bitterness and self-pity, with something more malleable, compassionate and generous than the metaphysical organ with which we began the journey. We learn, grow and become compassionate and generous as much through exile as homecoming; as much through loss as gain, as much through giving things away as in receiving what we believe to be our due.
I recently encountered a related illustration and excerpt in the BrainPicker review of The ABZ of Love, which nicely captures the periodic heartbreaks involved in loving relationships ... and although that book focuses on human relationships, I believe these highs and lows - or, perhaps, dark nights of the soul - also characterize our relationships with our work.
…after swinging around a certain point for a time, very small swings to and from in either direction, a sudden drop with the resultant feeling of hopelessness [and then] once more pendulation around one point for a time, then a drop, then that hopeless feeling, improvement again, etc., etc., without ever reaching the absolute ideal. Disappointments and depressions are necessary features of any process of learning, every development.
One of the themes that pervades much of Whyte's poetry and prose is the courage to explore the edges of our identities (hence the title of the SoundsTrue interview). During the workshop, he encouraged us to look to our edges as we explore new conversations - or new stories we might make up about our selves - so that "the edge may become the center". One of his poems that he recited, Coleman's Bed, speaks to this process:
Above all, be alone with it all, a hiving off, a corner of silence amidst the noise, refuse to talk, even to yourself, and stay in this place until the current of the story is strong enough to float you out.
The story I have been making up about myself for some time now is that I am currently "between stories", a period in which I don't seem to have - or be aware of - a particularly powerful or passionate personal narrative. Considered in the context of the poem, this prompts a beautiful question about why no new story has yet gained sufficient strength to "float me out". While I find myself talking less - online and offline - I may not have been cultivating sufficient silence to allow a new story to emerge ... and upon reflection, that seems an appropriate note upon which to end this post.
We euthanized our 16-year-old dog, JoJo, last week. A veterinarian came to our house and administered the injection while JoJo was resting in her favorite spot next to the couch, surrounded by her loving family. JoJo had gone deaf, was mostly blind, and had become increasingly incontinent, confused, unsettled and frail as the summer wore on. She showed little interest in food, walks or human contact during her final days.
JoJo was a great dog. We got her from an animal shelter in 1999, where she was sharing a cell with a Rottweiler. As we walked around to see all the dogs, she stood out as being the only one that was not jumping up and down and barking. When a shelter assistant put her on a leash so we could walk around with her for a bit, it was clear that she was very affectionate, and I'm sure she reminded us of our prior dog, Patches, a Border Collie / Springer Spaniel mix (who had died at age 15, a few months earlier). We had not intended to get another dog so soon, but JoJo - who we were told was probably a Brittany / Golden Retriever mix - won us over ... and upon our arrival home, a neighbor remarked "Oh, look - a blonde Patches!"
JoJo was always eager to please, and easy to train. Until she went deaf - and could no longer hear or respond to verbal commands - we didn't use a leash to take her for walks, and we never needed to tie her up in the yard. She learned where the boundaries were, and [usually] stayed within them. She enjoyed unstructured play but never showed much interest in fetch or other more structured games. She loved the water, but didn't particularly like swimming. She loved exploring tunnels and crawling through large pipes, but suffered from gephyrophobia (fear of bridges). She seemed to have a special affinity for mud.
Perhaps most importantly, JoJo loved people, and seemed to operate with the assumption that everyone loved her and wanted to show her affection ... an assumption that usually proved correct. She was a glutton for human contact. If anyone leaned over to pet her for more than a second or two, she would slide to the ground, roll over, and expose her belly to allow fuller access for more extensive petting. If the petter happened to be on the ground, JoJo would often roll over on top of the person's arm or leg in order to ensure the closest possible access. When someone would stop petting her, she would nudge that person's hand, presumably in case the person had simply forgotten to keep petting her. Her desire for affection knew no bounds.
I've long nursed a pet theory that the primary therapeutic benefit people derive from pets is not so much that our pets love us, but that we can love our pets ... with far less fear of the rejection we risk in loving other people. That is, it's the expression or giving of love rather than the receiving of love that really opens up the heart - and promotes other emotional and physiological benefits. Reading the article about Paro, I'm inclined to revisit and revise this theory. Perhaps it's not just loving someone - human, animal or robot - that makes us feel complete, it is the [perception of] being needed by someone we love that helps us feel like we matter ... like our life has purpose.
While my primary goal in this post is to celebrate JoJo's life, I don't want to minimize the grief we feel about her death. She was an integral part of our family for many years, through much of our children's lives. I think JoJo's death is especially hard on Amy, because she was JoJo's primary care-giver, and there is now one less living being in the household who needs her care and attention.
As was the case after the death of Patches, I am anticipating an extended period of dog-free living ... but I also would not be surprised if we seek to fill the void created by JoJo's death in the not-too-distant future.
Part of the story I make up about myself - and the feelings and judgments surrounding that story - is derived from conscious and unconscious messages I received from my father. I find revisiting and reinterpreting those messages - and making conscious choices about how I want to be in this world - is a lifetime's work. When I read Scott Berkun's post about his new book project on fathers and children, I started another round of revisitation and reinterpretation in a comment there ... which grew so long I realized it would be more appropriate to cut, paste and elaborate on it here as a blog post of its own. I'm going to take advantage of this migration and elaboration as a pretext for articulating a pet theory of microevolution that has evolved through my periodic reflections on father-child relationships in my family.
The message from my father that I revisit most often - and find most difficult to reinterpret - is: I am not worth spending time with.
When I trace the source of this message, I remember a period when I was around 9 (+/- 2) years old, when I would often ask my father if he would play ball with me - football, baseball or basketball, depending on the season. His verbal response was invariably "maybe later" (a phrase that can still trigger feelings of anger and resentment) and his subsequent actions invariably communicated "no". The message I unconsciously interpreted in his unwillingness to play ball with me was that I am not worth spending time with (i.e., it's about me, not him). The evasive way that message was communicated left me with the mostly unconscious conviction that people don't want to spend time with me, even if they say they do (or won't say they don't).
I now consciously interpret this message's origin as arising from my father's alcoholism, and on an intellectual level, after years of intermittent involvement in 12-step programs and other forms of counseling, I can see his disengagement as a symptom of that disease. And I can embrace the 3 Cs: I didn't cause it - the alcoholism or his disengagement - and couldn't control it or cure it when he was alive (he died in 1996). I don't believe my father was consciously choosing not to play ball with me, and I don't feel anger now when I think of him, but I sometimes still feel sadness. My feelings about him are tempered by what I know about his relationship with his own father, and the changes he was able to make in his own parenting.
Which brings me to my pet theory of microevolution: 99% of everything we do as parents is unconsciously channeling the behavior of one or both of our parents; 1% of what we do is based on conscious choices to reject negative parenting practices ("I will never do that to my son/daughter!"). I used to think that 1% of what we do is based on conscious choices to adopt positive parenting practices ("I will always to do that for my son/daughter") ... but I couldn't think of any positive practices I've consciously chosen to adopt. I believe we intend to consciously adopt or reject much higher percentages of our parents' parenting practices, but I so often find myself unconsciously behaving the same way my father or mother did, that I think the 99% estimate is more realistic.
Like my father, my paternal grandfather was also an alcoholic. Although my grandfather had many good qualities and was always very kind to me, my father told me that when he was growing up, his father was quick to bring out a razor strap to apply corporal punishment in disciplining his children. My father also told me that he swore he would never lay a hand - or razor strap - on his children, and he never did.
My maternal grandfather was not an alcoholic, and also had many good qualities and was often very kind to me. However, he was also very status conscious, and he regularly compared me and my accomplishments to those of my cousins. As I entered adolescence - a period during which my grades and my interest in many of the things he valued declined, while my cousins' academic and athletic accomplishments continued to shine - I was nearly always on the losing end of the comparison. He may have found me to be worth spending time with, but I don't believe he felt much pride about me, or at least not the kind of pride he so often expressed about his other grandsons.
While I'm not always sure I've made good progress in overcoming my own trance of unworthiness - the persistent conviction that "I'm not good enough" - I swore I would not pass the trance on to my children. I made conscious choices about always getting involved in my children's sports activities (as an assistant coach, scorekeeper or other administrative role) and I never - ever (!) - turned down an invitation to play ball. I also assiduously avoided temptations to engage in implicit or explicit comparisons.
Tears are welling up as I type these words ... as they do every time I hear Harry Chapin's classic song, Cat's in the Cradle:
My son turned ten just the other day. He said, "Thanks for the ball, dad, come on let's play. Can you teach me to throw?" I said, "Not today, I got a lot to do." He said, "That's ok." And he walked away, but his smile never dimmed, Said, "I'm gonna be like him, yeah. You know I'm gonna be like him."
I often wonder how much I've turned out just like my father. I am not an alcoholic but I do tend to be a workaholic, a characteristic that I share with my father and both grandfathers. My wife has told me she sometimes feels like a "computer widow", due to my repeated prioritization of work over family (prospective future blog topic: husbands, wives and microevolution). I've been concerned that, despite my steadfast intention, and regular engagement in some dimensions of my children's lives, I may have unconsciously communicated a message of unworthiness to my children. I may have been willing to play ball with them, but possibly neglected them in other important ways.
I recently asked my 18-year-old son - and, separately, my 22-year-old daughter - about whether they feel unworthy, or whether they've ever felt that I thought they were not worth spending time with. I was happy that both reported healthy feelings of worthiness and assured me that they never received any unworthiness messages from me ... although I realize that self-awareness, and self-reporting (especially to authority figures), can be highly biased.
I also sent them a link to Taming the Mammoth: Why You Should Stop Caring What Other People Think, an essay tracing the evolutionary roots of our pervasive "craving for social approval and admiration, and a paralyzing fear of being disliked" that - as a praise junkie - I found both resonant and inspiring. Both of my children said they enjoyed the essay, but they do not see themselves as being inordinately weighed down by a social survival mammoth.
I don't know what other ills I have inflicted on my children - that will probably require future rounds of revisitation and reinterpretation, by me and them - but I like to believe I've made some microevolutionary improvement over an earlier generation.
As for my own personal evolution, I've been inspired by John Hagel's recent series of posts exploring the insights and impacts of personal narratives, identifying and understanding the dysfunctional forces that may have shaped our early lives, and then consciously crafting new personal narratives that transform those challenges into gifts for ourselves and others. I still feel very much like I'm between stories, which is probably why my blog and Twitter feed have tended to be less active and more professional / technical in nature lately. I am not yet willing to craft a new personal narrative, but I am increasingly open to new visitations and interpretations ... and evolution at various scales.
I thought it would be fun to experiment with all 3: implementing in Python the recursive C function used by Gustavo Duarte, doing so inside PythonTutor (which can generate an iframe) and then embedding the PythonTutor iframe inside an IPython Notebook, which I would then embed in this blog post.
Unfortunately, I haven't achieved the trifecta: I couldn't figure out how to embed the PythonTutor iframe inside an IPython Notebook, so I will embed both of them separately here in this blog post. Across the collection, 3 flavors of visualizing recursion are shown:
simple print statement output tracing the call stack (ASCII version)
a static call stack image created by Gustavo
a dynamic call stack created automatically by PythonTutor
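To make the first flavor concrete, here is a minimal sketch of print-statement call-stack tracing. It uses factorial as a stand-in example; the actual notebook reimplements Gustavo Duarte's own C example, which I won't reproduce here.

```python
# Sketch of the first flavor: tracing the call stack with
# indentation-based ASCII output. Each level of recursion
# indents its trace lines two more spaces.
def factorial(n, depth=0):
    indent = "  " * depth
    print(f"{indent}factorial({n}) called")
    if n <= 1:
        result = 1
    else:
        result = n * factorial(n - 1, depth + 1)
    print(f"{indent}factorial({n}) returns {result}")
    return result

factorial(3)
```

The nested indentation mirrors the growth and shrinkage of the call stack as the recursion descends and unwinds.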
I'll start out with embedding Motivating and Visualizing Recursion in Python, an IPython Notebook I created to interweave text, images and code in summarizing Gustavo Duarte's compelling critique and counter-proposal for how best to teach and visualize recursion, and reimplementing his examples in Python.
I really like the way that PythonTutor enables stepping through and visualizing the call stack (and other elements of the computation). It may not be visible in the iframe above (you have to scroll to the right to see it), so I'll include a snapshot of it below.
If anyone knows how to embed a PythonTutor iframe within an IPython Notebook, please let me know, as I'd still welcome the opportunity to achieve a trifecta ... and I suspect that combining these two tools would open up even richer educational opportunities for Pythonistas.
"this one got me the most excited about getting home (or back to work) to practice what I learned"
Well, I got back to work, and learned how to create an IPython Notebook. Specifically, I created one to provide a rapid "on-ramp" for computer programmers who are already familiar with basic concepts and constructs in other programming languages to learn enough about Python to effectively use the Atigeo xPatterns analytics framework (or other data science tools). The Notebook also includes some basic data science concepts, utilizing material I'd earlier excerpted in a blog post in which I waxed poetic about the book Data Science for Business, by Foster Provost and Tom Fawcett, and other resources I have found useful in articulating the fundamentals of data science.
The rapid on-ramp approach was motivated, in part, by my experience with the Natural Language Toolkit (NLTK) book, which provides a rapid on-ramp for learning Python in conjunction with the open-source NLTK library to develop programs using natural language processing techniques (many of which involve machine learning). I find that IPython Notebooks are such a natural and effective way of integrating instructional information and "active" exercises that I wish I'd discovered them back when I was teaching courses using Python at the University of Washington (e.g., what came to be known as the socialbots course). I feel like a TiVo fanatic now, wanting to encourage anyone and everyone sharing any knowledge about Python to use IPython Notebooks as a vehicle for doing so.
Not only was this my first IPython Notebook, but I'm somewhat embarrassed to admit that the Python for Data Science repository represents my first contribution to GitHub. When I was teaching at UW, I regularly encouraged students to contribute to open source projects. Now I'm finally walking the talk ... better late than never, I suppose.
I'll include the contents of the repo's README.md file below. Questions, comments and other feedback are most welcome.
This short primer on Python is designed to provide a rapid "on-ramp" for computer programmers who are already familiar with basic concepts and constructs in other programming languages to learn enough about Python to effectively use open-source and proprietary Python-based machine learning and data science tools.
The primer is spread across a collection of IPython Notebooks, and the easiest way to use the primer is to install IPython Notebook on your computer. You can also install Python, and manually copy and paste the pieces of sample code into the Python interpreter, as the primer only makes use of the Python standard libraries.
There are three versions of the primer. Two versions contain the entire primer in a single notebook:
There are several exercises included in the notebooks. Sample solutions to those exercises can be found in two Python source files:
simple_ml.py: a collection of simple machine learning utility functions
SimpleDecisionTree.py: a Python class to encapsulate a simplified version of a popular machine learning model
There are also 2 data files, based on the mushroom dataset in the UCI Machine Learning Repository, used for coding examples, exploratory data analysis and building and evaluating decision trees in Python:
agaricus-lepiota.data: a machine-readable list of examples or instances of mushrooms, represented by a comma-separated list of attribute values
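To give a feel for the data format, here is a small sketch of loading such records with Python's standard csv module. The two records below are abbreviated, hypothetical rows in the agaricus-lepiota.data style, where each row is a comma-separated list of attribute values and the first value is the class label ('e' for edible, 'p' for poisonous); the real file has many more columns.

```python
import csv
from io import StringIO

# Two abbreviated sample records in the agaricus-lepiota.data style;
# the first field of each row is the class label.
sample = StringIO(
    "p,x,s,n,t\n"
    "e,x,s,y,t\n"
)

instances = list(csv.reader(sample))
labels = [row[0] for row in instances]
print(labels)  # class labels for the two sample mushrooms
```

Since the primer relies only on the standard library, the same csv.reader call works directly on the downloaded file in place of the StringIO stand-in.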
I attended my first Strata conference last week. The program offered a nice blend of strategic and technical insights and experiences regarding the design and use of "big data" systems. Atigeo was a sponsor, and I spent much of my time in our booth demonstrating and discussing our xPatterns big data analytics platform (about which I may write more later). Outside the exhibit area, highlights included a demonstration of the IPython Notebook, a tutorial on neural networks and deep learning, and a panel on Data for Good. I often find it helpful to compile - and condense - my notes after returning from a conference, and am sharing them here in case they are of use to others.
After a brief review of some key historical and conceptual underpinnings of machine learning, Alex Gray delineated 3 sources of error that data scientists must contend with: finite data, wrong parameters and the wrong type of model. Techniques for reducing error include weak scaling (use more machines to model more data), strong scaling (use more machines to model the same data, faster) and exploratory data analysis and visualization tools. Demonstrations included sentiment analysis of Twitter data during the US presidential election, identification of outliers in a large data set and a visualization of Wikipedia [unfortunately, I can't find the slides or any information about these demos online]. A quotable exchange about the no free lunch theorem: "Do I have to read all of these machine learning papers to understand this concept?" "Yes."
Alice Zheng highlighted the gap between algorithms, which prefer certain data structures, and raw data, which often lends itself to different ones, and noted that data structures (beyond flat, 2-dimensional tables) are an often overlooked bridge between data and algorithms in data science and engineering efforts. She showed how data for movie recommendation systems and network diagnostic systems can be represented as tables, and then how representing them with graph data structures can make them much more efficient to work with. Her colleague, Carlos Guestrin, gave a more in-depth GraphLab tutorial in another session later that afternoon, which I imagine was somewhat similar to the one captured in a 42-minute video of a GraphLab session at Strata NY 2013.
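A toy illustration of the table-vs-graph point (my own, not from the talk): the same movie ratings stored as a flat table require a scan of every row to answer "what has this user rated?", whereas a bipartite graph keyed by user answers it with a single lookup. The names and ratings below are made up.

```python
# The same ratings, first as a flat table (one row per rating) ...
ratings_table = [
    ("alice", "Brazil", 5),
    ("bob", "Brazil", 3),
    ("alice", "Alien", 4),
]

# ... then as a bipartite user->movie graph, built in one pass.
ratings_graph = {}
for user, movie, stars in ratings_table:
    ratings_graph.setdefault(user, {})[movie] = stars

# One user's ratings: a dictionary lookup instead of a full table scan.
print(ratings_graph["alice"])  # {'Brazil': 5, 'Alien': 4}
```

Systems like GraphLab generalize this idea, distributing graph-structured computations across machines rather than relying on nested Python dictionaries.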
Henrik Brink and Joshua Bloom highlighted the gaps between data science and production systems, emphasizing the optimization tradeoffs among accuracy, interpretability and implementability. Effectively measuring accuracy requires choosing an appropriate evaluation metric that captures the essence of what you (or your customer) cares about. Interpretability should focus on what an end user typically wants to know - why a model gives specific answers (e.g., which features are most important) - rather than what a data scientist may find most interesting (e.g., how the model works). Implementability encompasses the time and cost of putting a model into production, including considerations of integration, scalability and speed. The lessons learned from the Netflix Prize are instructive, since implementation concerns led the sponsors not to deploy the winning algorithm, even though it achieved improved accuracy.
Ted Dunning defined an anomaly as "What just happened that shouldn't?" and posited the goal of anomaly detection as "Find the problem before other people do ... But don't wake me up if it isn't really broken." For detecting heart rate anomalies, he described creating a dictionary of shapes representing lower-level patterns in a heart rate, and then using adaptive thresholding to look for outliers. In many anomaly detection problems, he has found that key elements can be effectively modeled as mixture distributions.
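As a toy illustration of the mixture-distribution idea (my own sketch, not Dunning's code; the component weights, means and readings are invented), one can model "normal" heart rates as a two-component Gaussian mixture and flag readings whose density falls far below the rest:

```python
import math

def gauss_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Toy mixture: two "normal" heart-rate modes (resting ~60 bpm, active ~120 bpm).
components = [(0.7, 60.0, 5.0), (0.3, 120.0, 10.0)]  # (weight, mean, std dev)

def mixture_density(x):
    return sum(w * gauss_pdf(x, mu, s) for w, mu, s in components)

readings = [58, 62, 61, 118, 125, 63, 59, 121, 200]  # 200 bpm is the anomaly

# Crude adaptive cutoff: flag readings whose density is orders of
# magnitude below the median density of the batch.
densities = [mixture_density(x) for x in readings]
threshold = sorted(densities)[len(densities) // 2] * 1e-3
anomalies = [x for x, d in zip(readings, densities) if d < threshold]
```

Both resting and active readings score as likely under one component or the other, while the 200 bpm reading sits in the far tail of both and gets flagged.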
Deep learning is an extension of an old concept - multi-layer neural networks - that has recently become very popular. Ilya Sutskever provided a very accessible overview of the history, concepts and increasing capabilities of these systems, provocatively asserting - and providing some evidence for - "Anything humans can do in 0.1 seconds, a big 10-layer network can do, too." Connecting all nodes in all layers of such a network would be prohibitively expensive; convolutional neural networks restrict the number of connections by mapping only subregions between different layers. Several successful (and a few unsuccessful) examples of visual object recognition were illustrated in the Google+ photo search service. References were made to Yann LeCun's related work on learning feature hierarchies for object recognition and word2vec, an open source tool for computing vector representations of words that can be used in applying deep learning to language tasks.
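The subregion-mapping idea is easy to see in one dimension. This tiny sketch (my own, not from the talk) slides a two-weight kernel along a signal, so each output connects to only two inputs and the same weights are reused at every position - the connection-restricting trick convolutional layers rely on:

```python
# Minimal 1-D convolution: each output sees only a small subregion of
# the input, and the same kernel weights are shared at every position,
# instead of every input connecting to every output.
def conv1d(signal, kernel):
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# A [1, -1] kernel acts as a simple edge detector on a step signal.
edges = conv1d([0, 0, 1, 1, 1, 0, 0], [1, -1])
```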
Kira Radinsky led off with some of the business complexities of sales cycles - due to factors such as the time, cost, probability and amount of a sale - and the typically low rate of conversion (< 1%). She mentioned a number of techniques used by SalesPredict, such as automatic feature generation, classifiers as features and the use of personas, to deal with sparseness and severe negative skew in CRM data. Revisiting the importance of interpretability, she described this perception problem as "Emotional AI", and gave an example in which, even though SalesPredict had achieved a 3-fold increase in conversion rates for a customer, the customer was not happy until/unless they could understand why the system was prioritizing certain leads. She also warned of the dangers of success in prediction: once customers start relying on the ranking of sales leads, they focus all their efforts on those with "A" scores, neglecting all others, leading to potential missed opportunities (since the ranking is imperfect) and further skewing of the data.
Magda Balazinska's group is exploring a number of ways to facilitate the management of big data, and this session focused on just one: the Collaborative Query Management System (CQMS). Unfortunately, I had to step away for part of this session, but my understanding is that CQMS involves collecting successful queries and making relevant queries available to other users who appear to be following similar trajectories in exploratory data analysis. While the goals and design of the system seem reasonable, the group has not yet conducted any user studies to validate whether users find the provision of relevant queries helpful in their analysis.
Max Gasner encouraged us to apply a key lesson from relational databases to big data: decoupling implementation enables abstraction. Furthermore, he proposed that a successful big data platform should be robust, honest, flexible and simple. He called out BigML (co-founded by my friend and former boss & CEO at Strands, Francisco Martin) as a first generation example of such a system. Echoing issues of interpretability raised in two earlier sessions, he noted that "black boxes are easy to use and hard to trust". Riffing on a phrase popularized by Mao Tse-Tung - "let a hundred flowers bloom; let a hundred general purpose predictive platforms contend" - he noted there is lots of room to innovate on APIs and presentation, and so lots of opportunities for companies (like ours) building general purpose predictive platforms (GPPPs).
Ben Hamner warned that while machine learning is powerful, there are lots of ways to screw up; but he claimed that all are avoidable. Potential problems include overfitting, data insufficiency (or "finite data" as Alex Gray described it), data leakage (features in the problem representation that leak information unavailable at prediction time) and solving the wrong problem (calling to mind the 12 steps for technology-centered designers). He illustrated many of these problems - and solutions - with an amusing story about the iterative development of a vision-based system for regulating access through a pet door. He also offered an amusing quote by a machine learning engineer that captured the widespread zeal for ensemble learning methods:
"We'd been using logistic regression in high dimensional feature spaces, tried a random forest on it and it improved our performance by 14%. We were going to be rich!!"
I was initially skeptical about the wisdom of scheduling a presentation on algebra in the last slot of the session, but Oscar Boykin offered an energetic and surprisingly engaging overview of semigroups (sets with associative operations), monoids (semigroups with an identity element) and the value of expressing computations as associative operations. He went on to champion the value of hashing rather than sampling to arrive at approximate - but acceptable - solutions to some big data problems, using Bloom filters, HyperLogLog and Count-min sketches as examples. In addition to sharing his slides, he also offered some challenges for those interested in diving further into the topic.
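The practical payoff of associativity is that a reduction can be split across machines and the partial results combined in any grouping. A minimal sketch of my own, using integer addition as the monoid (any associative combine with an identity works the same way):

```python
from functools import reduce

# A monoid: a set with an associative combine and an identity element.
# Here: (int, +, 0). Bloom filters (bit-sets under OR) and HyperLogLog
# registers (element-wise max) are monoids too, which is why such
# sketches merge cleanly across machines.
combine, identity = lambda a, b: a + b, 0

data = list(range(1, 101))

# Sequential fold, as a single machine would do it...
sequential = reduce(combine, data, identity)

# ...vs. splitting into chunks, folding each chunk independently (as
# separate machines could), then combining the partial results.
# Associativity guarantees the answers agree.
chunks = [data[i:i + 25] for i in range(0, len(data), 25)]
partials = [reduce(combine, chunk, identity) for chunk in chunks]
parallel = reduce(combine, partials, identity)
```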
A Sampling of Other Strata Presentations
I spent much of my time on Wednesday and Thursday in the exhibitors area, but did manage to get out to see a few sessions, some of which I will briefly recount below.
Geoffrey Moore, author of Crossing the Chasm, was an ideal choice for a keynote speaker at Strata, given the prevalence of references to chasms and gaps throughout many of the other sessions. Moore presented a variant of the Technology Adoption Life Cycle, noting that pragmatists - on the other side of the chasm from the early adopters - won't move until they feel pain. For consumer IT, he recommends adopting lean startup principles and leaping straight to the "tornado"; for enterprise IT, he recommends focusing on breakthrough projects with top-tier brands, and building up high value use cases with compelling reasons to buy. He also reiterated one of his most quotable big data quotes:
"Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway"
Eric Pugh shared some insights and experiences in building the Global Patent Search Network for the US Patent and Trademark Office. He and his team had to navigate tensions between two classes of developers (data people and UX people), as well as two classes of users (patent examiners and the general public). Among the lessons: don't underestimate the amount of effort required for the user interface (40% for GPSN), put a clickable prototype with a subset of the data in front of users as early as possible, don't move files (use S3funnel), be careful where you sample your data (data volume can increase exponentially over time), and keep the pipeline as simple as possible.
Yann Ramin shared a broad array of problems - and solutions - in working with time series data, alerts and traces at Twitter, some of which are captured in an earlier blog post on Observability at Twitter. He made a strong case for the need to move beyond logging toward the cheap & easy real-time collection of structured statistics when working with web services (vs. programs running on host computers), highlighting the value of embedding first tier aggregation and sampling directly in large-scale web applications. Among the open source tools he illustrated were: the Finagle web service instrumentation package, a template (twitter-server) used to define Finagle-instrumented services at Twitter, and a distributed tracing system (zipkin) based on a 2010 research paper on Dapper. As with many other Strata presenters, he also had a pithy quote to capture the motivation behind much of the work he was presenting:
"When something is wrong, you need data yesterday"
Of all the talks at Strata, this one got me the most excited about getting home (or back to work) to practice what I learned. In an act of recursive storytelling, Brian Granger told a story about how to use the IPython Notebook and NBViewer (NoteBook Viewer) to compose and share a reproducible story with code and data ... by using IPython Notebook. Running the IPython Notebook in a browser, he was able to show and execute segments of Python code and have the results returned and variously rendered in the browser window. While the demonstration focused primarily on Python, the notebook also supports a variety of other languages (including Julia, R and Ruby). A recurring theme throughout the conference was bridging gaps, and in this case, the gap was characterized as "a grand canyon between the user and the data", with IPython Notebook serving as the bridge. He had given a longer tutorial - IPython in Depth - on Tuesday, and I plan to soon use the materials there to bridge the gap from learning to doing.
[Update, 2014-04-09: I have followed through on my intention, creating and posting an IPython Notebook on Python for Data Science]
The last session I attended at Strata was also inspiring, and I plan to look for local opportunities for doing [data science for] good. The moderator and panelists have all been involved in projects applying data science techniques and technologies to help local communities, typically by helping local government agencies - which often have lots of data but little understanding of how to use it - better serve their constituents. Drew Conway helped NYFD use data mining to rank buildings' fire risk based on 60 variables, enabling them to better prioritize fire department inspectors' time. Rayid Ghani co-founded the Data Science for Social Good summer fellowship program at the University of Chicago last year, and Elena Eneva was one of the program mentors who was willing to take a sabbatical from her regular work to work with teams of students in formulating big data solutions to community problems [disclosure: Rayid and Elena are both friends and former colleagues from my Accenture days]. Rayid noted that there are challenges in matching data science solutions to community problems, and so he developed a checklist to help identify the most promising projects (two elements: a local community or government organization that has - and can provide - access to the data, and that has the capacity for action). Elena suggested that most data scientists would be surprised at how much impact they could have by applying a few simple data science techniques. If I were to attempt to summarize the panel with a quote of my own, I would riff on Margaret Mead:
Never underestimate the power of a few simple data science techniques to help a local community #strataconf #data4good
Any sufficiently large number of signals is indistinguishable from noise. I suspect this principle does not figure prominently in the consciousness of people who are live-tweeting from conferences or other physical world events, or participating in purely virtual tweet chats. I have filtered and even unfollowed several friends who have gone on live-tweeting or tweet chatting binges, as I do not care to have my main Twitter feed consumed by tweets from events I do not care about.
A tweet today from Alyssa Royse suggests I am not alone in this irritation regarding Twitter etiquette:
All of my blocked hashtags end in "con." Time for conferences to rethink how they ask people to use Twitter.
Although I do not physically attend many conferences or other tweet-worthy events these days, when I do, I have adopted a practice that others may find useful. I use the @reply mechanism to reference the event Twitter handle at the start of the tweet - which hides the tweet from anyone who does not follow both me and the event - and then use the designated event hashtag so that anyone who is explicitly following the event hashtag can also see it. Others may remain blissfully unaware of my avid participation in and live transcription of the event highlights.
But generally speaking, I try to maintain a small footprint for my live-tweeting ... and I would like to encourage others to adopt a similar practice.
[Oops - forgot about tweet chats ... probably because I do not participate in them. Briefly, a tweet chat is a period (typically an hour) during which a moderator will post a series of questions or prompts, and then others post responses to that question, all using a designated hashtag. A similar practice can be adopted in such scenarios, in which respondents direct their responses to the moderator (or the person who posted the question) using @replies.]
I didn't physically attend Strata NY + Hadoop World this year, but I did watch the keynotes from the conference. O'Reilly Media kindly makes videos of the keynotes and slides of all talks available very soon after they are given. Among the recurring themes were warnings against the hype of big data, the increasing utilization of Hadoop as a central platform (hub) for enterprise data, and the importance and potential impact of making data, tools and insights more broadly accessible within an enterprise and to the general public. The keynotes offered a nice mix of business (applied) & science (academic) talks, from event sponsors and other key players in the field, and a surprising - and welcome - number of women on stage.
Meanwhile, I'll include some of my notes on interesting data and insights presented by others, in the order in which presentations were scheduled, linking each presentation title to its associated video. Unlike previous postings of notes from conferences, I'm going to leave the notes in relatively raw form, as I don't have the time to add more narrative context or visual augmentations to them.
3000 people at the conference (sellout crowd), up from 700 people in 2009.
Hadoop started out as a complement to traditional data processing (offering large-scale processing).
Progressively adding more real-time capabilities, e.g., Impala & Cloudera Search.
More and more capabilities migrating from traditional platforms to Hadoop.
Hadoop moving from the periphery to the architectural center of the data center, emerging as an enterprise data hub.
Hub: scalable storage, security, data governance, engines for working with the data in place. Spokes connect to other systems, people.
Announcing Cloudera 5, "the enterprise data hub"
Announcing Cloudera Connect Cloud, supporting private & public cloud deployments
Announcing Cloudera Connect Innovators; inaugural innovator is Databricks (Spark real-time in-memory processing engine)
Need to focus on business needs, not the technology.
You can use science, technology and statistics to figure out what the answers are, but it is still an art to figure out what the right questions are.
How to focus on the right questions:
* hire people with academic knowledge + business savvy
* train everyone on analytics (internal DataCamp at Facebook for project managers, designers, operations; 50% on tools, 50% on how to frame business questions so you can use data to get the answers)
* put analysts in an org structure that allows them to have impact ("embedded model": hybrid between centralized & decentralized)
Goals of analytics: impact, insight, actionable insight, evangelism ... own the outcome
Tony is director at the Experience Research Lab (is this the group formerly known as People & Practices?) [I'm an Intel Research alum, and Tony is a personal friend]
Personal data economy: system of exchange, trading personal data for value
3 opportunities:
* hyper individualism (Moore's Cloud, programmable LED lights)
* hyper collectivity (student projects with outside collaboration)
* hyper differentiation (holistic design for devices + data)
Big data is by the people and of the people ... and it should be for the people
Praises Apache, open source, GitHub (highlighted by someone from Microsoft?)
Make big data accessible (MS?)
Hadoop is a cornerstone of big data; Microsoft is committed to making it ready for the enterprise
HD Insight (?): Azure offering for Hadoop
We have a billion users of Excel, and we need to find a way to let anybody with a question get that question answered.
Power BI for Office 365 Preview
A Turing test for advertising fraud
Dstillery: predicting consumer behavior based on browsing histories
Saw 2x performance improvement in 2 weeks; was immediately skeptical
Integrated additional sources of data (10B bid requests)
Found "oddly predictive websites", e.g., a women's health page --> 10x more likely to check out a credit card offer, order online pizza, or read about luxury cars
Large advertising scam (botnet); 36% of traffic is non-intentional (Comscore)
Co-visitation patterns, cookie stuffing
Botnet behavior is easier to predict than human behavior
Put bots in a "penalty box": ignore non-human behavior
When it comes to big data, BI = BS
Contrasts enterprises based on fiction, feeling & faith vs. fact-based enterprises
Big data analytics: letting regular business people iteratively interrogate massive amounts of data in an easy-to-use way so that they can derive insight and really understand what's going on
3 layers: deep processing + acceleration + rich analytics
Product: Hadoop processing + in-memory acceleration + analytics engines + Vizboards
Example: event series analytics + entity-centric data catalog + iterative segmentation
Louisiana Purchase: Lewis & Clark address a big data acquisition problem
Thomas Jefferson: "Your observations are to be taken with great pains & accuracy, to be entered intelligibly, for others as well as yourself"
What happens when you make data more liquid?
4 characteristics of "openness" or "liquidity" of data:
* degree of access
* machine readability
* cost
* rights
Benefits of open data:
* transparency
* benchmarking exposing variability
* new products and services based on open data (Climate Corporation?)
How open data can enable value creation:
* matching supply and demand
* collaboration at scale ("with enough eyes on code, all bugs are shallow" --> "with enough eyes on data, all insights are shallow")
* increased accountability of institutions
Open data can help unlock $3.2B [typo? s/b $3.2T?] to $5.4T in economic value per year across 7 domains:
* education
* transportation
* consumer products
* electricity
* oil and gas
* health care
* consumer finance
What needs to happen?
* identify, prioritize & catalyze data to open
* developers, developers, developers
* talent (data scientists, visualization, storytelling)
* address privacy, confidentiality, security, IP policies
* platforms, standards and metadata
Hadoop started out as a storage & batch processing system for Java programmers
Increasingly enables people to share data and hardware resources
Becoming the center of an enterprise data hub
More and more capabilities being brought to Hadoop
Inevitable that we'll see just about every kind of workload being moved to this platform, even online transaction processing
GE has created 24 data-driven apps in one year
As a Pivotal investor and a Pivotal company, we work with them to help build these data-driven apps, which generated $400M in the past year
Pivotal code-a-thon, with Kaiser Permanente, using Hadoop, SQL and Tableau
What it takes to be a data-driven company:
* Have an application vision
* Powered by Hadoop
* Driven by Data Science
Took Facebook 9 months to reach the number of users (100M) that it took radio 40 years to achieve
Use cases:
* At-risk students stay in school with real-time guidance (University of Kentucky)
* Soccer players improve with spatial analysis of movement
* Visualization of cancer treatment options
Big data geek challenge (SAP Lumira): $10,000 for best application idea
Social TV Lab: how can we derive value from the data that is being generated by viewers today?
Methodology: start with Twitter handles of TV shows, identify followers, collect tweets and their networks (followees + followers), build recommendation systems from the data (social network-based, product network-based & text-based (bag of words)).
Correlate words in tweets about a show with demographics of the audience (Wordle for male vs. female)
1. You can use Twitter followers to estimate viewer audience demographics
2. TV triggers lead to more online engagement
3. If brands want to engage with customers online, play an online game
Real-time response to advertisement (Teleflora during the Super Bowl): peaking buzz vs. sustained buzz
Demographic bias in sentiment & tweeting (male vs. female response to Teleflora, others)
Influence = retweeting; women are more likely to retweet women, men more likely to retweet men
4. Advertising response and influence vary by demographic
5. GetGlue and Viggle check-ins can be used as a reliable proxy for viewership to:
* predict Nielsen viewership weeks in advance
* predict customer lifetime value
* measure time shifting
All at the individual viewer level (vs. household level)
Ultracompact satellites to image the earth on a much more frequent basis, to get inside the human decision-making loop so we can help human action.
Redundancy via a large # of small satellites with the latest technology (vs. older, higher-reliability systems on one satellite)
Recency: shows more deforestation than Google Maps, river movement (vs. OpenStreetMap)
API for the Changing Planet, hackathons early next year
[No slides?]
Invention of sliced bread; big data [hyped] as the biggest thing since sliced bread
Think about big data as a journey
1. It's all about discipline and knowing where you are going (vs. being enamored with the tech); VC investment in big data is $2.6B (IBM, SAP, Oracle, ... $3-4B more)
2. Understand that these technologies do not live in a silo. The thing that you don't want to have happen is that this becomes a science fair project. At the end of the day, this is going to be part of a broader architecture.
3. This is an investment decision; you want to have a return on investment.
The Next Era of Data Analysis: the next big thing is how you analyze data from many disparate sources, and do it quickly.
More data: internal data + external data
More speed: fast answers + discovery; increase speed of access & speed of processing so that iterative insight becomes possible
More people: collaboration + context; it needs to become easier for everyone across the business (not just specialists) to see insights as they become available, since decisions have to be made faster
Data-aware collaboration, data harmonization
Demo: 6:10-8:30
1 of 3 people in the US has had a direct experience with cancer in their family; 1 in 4 deaths are cancer-related
Jim's mom has chronic leukemia. He had just gotten off the phone with his mom (it's his birthday), and she asked "what is it that you do?" "We use data to solve really hard problems like cancer" "When?" "Soon"
Cancer is the 2nd leading cause of death in children
"The brain trust in this room alone could advance cancer therapy more in a year than the last 3 decades." - Bjorn Brucher
We can help them by predicting individual outcomes, and then proactively applying preventative measures.
Big data starts with the application: stop building your big data sandboxes, stop building your big data stacks, stop building your big data Hadoop clusters without a purpose. When you start with the business problem, the use case, you have a purpose, you have focus.
50% of big data projects fail (reference?)
"Take that one use case, supercharge it with big data & analytics, we can take & give you the most comprehensive big data solutions, we can put it on the cloud, and for some of you, we can give you answers in less than 30 days"
"What if you can contribute to the cure of cancer?" [abrupt pivot back to initial inspirational theme]
Why coding is important: by 2020, 1.4M computing jobs
Women of color currently make up 3% of computing jobs in the US
Goal: teach 1M girls to code by 2040
Thus far: 2 years, 2000 girls, 7 states + Johannesburg, South Africa
[my favorite talk]
Anything which appears in the press in capital letters, and surrounded by quotes, isn't real.
There is no math solution to anything. Math isn't the answer; it's not even the question. Math is a part of the solution.
Pieces of math have different biases, different things they do well, different things they do badly, just like employees. Hiring one new employee won't transform your company; hiring one new piece of math also won't transform your company.
Normal distribution, bell curve: beautiful, elegant. Almost nothing in the real world is, in fact, normal. Power laws don't actually have means.
Joke: How do you tell the difference between an introverted and an extroverted engineer? The extroverted one looks at your shoes instead of his own.
The math that you think you know isn't right, and you have to be aware of that. Being aware of that requires more than just math skills.
Science is inherently about data, so "data scientist" is redundant; however, data is not entirely about science. Math + pragmatism + communication. Prefers "data artist" to data scientist.
Fundamentally, the hard part actually isn't the math; the hard part is finding a way to talk about that math. And the hard part isn't actually gathering the data; the hard part is talking about that data.
The most famous data artist of our time: Nate Silver. Data artists are the future.
What the world needs is not more R; what the world needs is more artists (Rtists?)
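The claim that power laws "don't actually have means" can be checked numerically. Here is a quick stdlib-only sketch of my own (not from the talk), sampling a Pareto distribution with tail index alpha = 1, whose theoretical mean is infinite:

```python
import random
import statistics

random.seed(1)

# Inverse-transform sampling from a Pareto distribution with tail
# index alpha = 1, for which the theoretical mean does not exist.
def pareto_draw(alpha=1.0):
    u = 1.0 - random.random()  # u in (0, 1]
    return u ** (-1.0 / alpha)

draws = [pareto_draw() for _ in range(10_000)]

# Every draw is >= 1, yet rare enormous outliers dominate the sum,
# dragging the sample mean far above the median. As n grows, the
# sample mean keeps drifting upward (roughly like log n) instead of
# converging - there is no mean to converge to.
biggest = max(draws)
mean = statistics.fmean(draws)
median = statistics.median(draws)
```

For a normal distribution, by contrast, the sample mean and median would sit close together and stabilize quickly as n grows.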
[co-author of my favorite book on Data Science]
Agrees with some of the critiques made by the previous speaker, but rather likes the term "data scientist"
Shares some quotes from Data Science and its relationship to Big Data and Data-Driven Decision Making
Gartner Hype Cycle 2012 puts "Predictive Analytics" at the far right ("Plateau of Productivity") [it's still there in Gartner Hype Cycle 2013, and "Big Data" has inched a bit higher into the "Peak of Inflated Expectations"]
More data isn't necessarily better (if it's from the same source, e.g., sociodemographic data); more data from different sources may help.
Using fine-grained behavior data, learning curves show continued improvement to massive scale. 1M merchants, 3M data points (? look up paper)
But sociodemographic + pseudo-social network data still does not necessarily do better; see Pseudo-Social Network Targeting from Consumer Transaction Data (Martens & Provost)
There seem to be very few case studies where you have really strong best practices with traditional data juxtaposed with strong best practices with another sort of data.
We see similar learning curves with different data sets, characterized by massive numbers of individual behaviors, each of which probably contains a small amount of information, and where the data items are sparse. See Predictive Modelling with Big Data: Is Bigger Really Better? (Enrique Junque de Fortuny, David Martens & Foster Provost)
Others have published work on fraud detection (Fawcett & FP, 1997; Cortes et al, 2001), social network-based marketing (Hill, et al, 2006), and online display-ad targeting (FP, Dalessandro, et al., 2009; Perlich, et al., 2013), but comparisons are rarely seen
Take-home message: the Golden Age of Data Science is at hand, and firms with larger data assets may have the opportunity to achieve significant competitive advantage.
Whether bigger is better for predictive modeling depends on:
a) the characteristics of the data (e.g., sparse, fine-grained data on consumer behavior)
b) the capability to model such data
Stuart Firestein brilliantly captures the positive influence of ignorance as an often unacknowledged guiding principle in the fits and starts that typically characterize the progression of real science. His book, Ignorance: How It Drives Science, grew out of a course on Ignorance he teaches at Columbia University, where he chairs the department of Biological Sciences and runs a neuroscience research lab. The book is replete with clever anecdotes interleaved with thoughtful analyses - by Firestein and other insightful thinkers and doers - regarding the central importance of ignorance in our quests to acquire knowledge about the world.
Each chapter leads off with a short quote, and the one that starts Chapter 1 sets the stage for the entire book:
"It is very difficult to find a black cat in a dark room," warns an old proverb. "Especially when there is no cat."
It's groping and probing and poking, and some bumbling and bungling, and then a switch is discovered, often by accident, and the light is lit, and everyone says "Oh, wow, so that's how it looks," and then it's off into the next dark room, looking for the next mysterious black feline.
Firestein is careful to distinguish the "willful stupidity" and "callow indifference to facts and logic" exhibited by those who are "unaware, unenlightened, and surprisingly often occupy elected offices" from a more knowledgeable, perceptive and insightful ignorance. As physicist James Clerk Maxwell describes it, this "thoroughly conscious ignorance is the prelude to every real advance in science."
The author disputes the view of science as a collection of facts, and instead invites the reader to focus on questions rather than answers, to cultivate what poet John Keats calls "negative capability": the ability to dwell in "uncertainty without irritability". This notion is further elaborated by philosopher-scientist Erwin Schrödinger:
In an honest search for knowledge you quite often have to abide by ignorance for an indefinite period.
Ignorance tends to thrive more on the edges than in the centers of traditional scientific circles. Firestein offers the analogy of a pebble dropped into a pond: most scientists tend to focus near the site where the pebble is dropped, but the most valuable insights are more likely to be found among the ever-widening ripples as they spread across the pond. This observation about the scientific value of exploring edges reminds me of another inspiring book I reviewed a few years ago, The Power of Pull, wherein authors John Hagel III, John Seely Brown & Lang Davison highlight the business value of exploring edges:
Edges are places that become fertile ground for innovation because they spawn significant new unmet needs and unexploited capabilities and attract people who are risk takers. Edges therefore become significant drivers of knowledge creation and economic growth, challenging and ultimately transforming traditional arrangements and approaches.
On a professional level, given my recent renewal of interest in the practice of data science, I find many insights into ignorance relevant to a productive perspective for a data scientist. He promotes a data-driven rather than hypothesis-driven approach, instructing his students to "get the data, and then we can figure out the hypotheses." Riffing on Rodin, the famous sculptor, Firestein highlights the literal meaning of "dis-cover", which is "to remove a veil that was hiding something already there" (which is the essence of data mining). He also notes that each discovery is ephemeral, as "no datum is safe from the next generation of scientists with the next generation of tools", highlighting both the iterative nature of the data mining process and the central importance of choosing the right metrics and visualizations for analyzing the data.
Professor Firestein also articulates some keen insights about our failing educational system - an arena from which I recently departed - that resonate with the growing misgivings I was experiencing in academia. He highlights the need to revise both the business model and the pedagogical model of universities, asserting that we need to encourage students to think in terms of questions, not answers.
W.B. Yeats admonished that "education is not the filling of a pail, but the lighting of a fire." Indeed. Time to get out the matches.
Rule Three: There are no mistakes, only lessons. Growth is a process of experimentation, a series of trials, errors, and occasional victories. The failed experiments are as much a part of the process as the experiments that work.
Rule Four: A lesson is repeated until learned. Lessons will be repeated to you in various forms until you have learned them. When you have learned them, you can then go on to the next lesson.
Firestein offers an interesting spin on this concept, adding texture to my previous understanding, and helping me feel more comfortable with my own highly variable learning process, as I often feel frustrated with re-encountering lessons many, many times:
I have learned from years of teaching that saying nearly the same thing in different ways is an often effective strategy. Sometimes a person has to hear something a few times or just the right way to get that click of recognition, that "ah-ha moment" of clarity. And even if you completely get it the first time, another explanation always adds texture.
My ignorance is revealed to me on a daily, sometimes hourly, basis (I suspect people with partners and/or children have an unfair advantage in this department). I have written before about the scope and consequences of others being wrong, but for much of my life, I have felt shame about the breadth and depth of my own ignorance (perhaps reflecting the insight that everyone is a mirror). It's helpful to re-dis-cover the wisdom that ignorance can, when consciously cultivated, be strength.