Communities & Technologies Conference (C&T 2007): Socializing and Sociologizing on the Web
July 12, 2007
The 3rd International Conference on Communities and Technologies (C&T 2007) – or cct2007, on Flickr and Slideshare – was held at Michigan State University two weeks ago. [Update: proceedings are now online.] Among the high order bits for me were the growing trend in analyzing data from normal use of large-scale social networking services (vs. designing and testing much smaller but more specialized and heavily instrumented systems used in contrived tasks), the [at times] painful recognition that I’m not really a social scientist (and probably can’t even play one on TV), and further confirmation that Africa is the new black.
Marc Smith (Microsoft Research) opened the conference with a Thursday evening keynote, signaling (and providing evidence) that the Internet is a sociologist’s playground, reviewing some of the work that he and his colleagues at the Community Technologies group have done over the years. Marc claimed that the collection of thread-o-spheres (e.g., postings on Usenet) was [still] much larger than the blogosphere, which may be true (I did not write down the number of Usenet postings Marc provided), but his estimate of 2 million active blogs contrasts sharply with figures from Technorati in the most recent State of the Live Web report, which puts that number at 70 million, growing at 120,000 per day, and generating 1.5 million posts per day. He showed some cool visualizations of newsgroup participation developed in the Netscan project, which help to graphically differentiate among answer people, question people and flame warriors, and discussed aspects of the AURA project, which uses a Windows Mobile device to scan and digitally annotate objects in the physical world.
I always enjoy Marc’s talks, and, as usual, he contributed to the expansion of my Amazon wish list, recommending books I [now] want to read, such as The Evolution of Cooperation (Robert Axelrod) and Governing the Commons: The Evolution of Institutions for Collective Action (Elinor Ostrom); I feel like I’m making progress, though, in that I’m currently reading one of the books he recommended – The Presentation of Self in Everyday Life (Erving Goffman) – and another – The Hidden Dimension (Edward Hall) – was already on my list. Interestingly, when he asked how many in the audience had Windows Mobile phones – offering free lens attachments for them to use AURA – only a handful of people raised their hands … and I would estimate the proportion of Windows PCs (vs. Macs) I saw at the conference to be at most 25% (of course with all the fashion[able] signaling on, through and about iPhones, Microsoft clearly isn’t the only technology company whose dominance is being challenged by Apple).
Friday morning opened with another engaging keynote, this time by Rob "CmdrTaco" Malda and Jeff "Hemos" Bates, co-founders of Slashdot, on The Life, Times and Tribulations of Slashdot. Rob talks really fast, which reflects his programming style ("I’m not a good programmer, but I’m a fast programmer" … demonstrated, in part, by his complete rewrite of the system using mod_perl and MySQL – which he strongly prefers over Postgres – over a period of 10 days in 1999), and so he and Jeff covered a lot of ground (10 years) in a short period (1 hour). Their shared Midwest ethic (or Dutch frugality) was evident in the burn rate they described to prospective investors in 1999 – "we need to eat and pay rent" – and Rob’s gloat quote, "my day job is cooler than yours", highlights important distinctions between the intrinsic vs. extrinsic motivations that often differentiate entrepreneurs from employees. Other irreverent and notable quotes include "all web statistics are lies", "all the real work [in advertising] is done by people who have no idea what they’re doing", "Twitter is the Rubik’s Cube of Web 2.0" and "give people numbers and they turn it into a game" (reflecting other observations about the pervasiveness of games). In the latter context, they noted that all users are playing to win – whether their intentions are positive or negative – so it’s important to know their motivations and resources, and figure out the best tools to channel their energies. Noting that tiny minorities (1-2%) can manipulate a large, heterogeneous group rather easily, they defended their form of representative democracy (via karma) as a way to avoid lowest common denomination, and argued that radical transparency is not always a Good Thing.
In addition to learning about their insights and experiences, I also added a number of terms to my vocabulary - webhead diversification, master-master replication, Daddypants systems of moderation, crapflooding and karma whoring - and I will strive to eliminate the use of "here" as anchor text in all future blog posts.
Nicole Ellison presented "Deceptive Self-Presentation in Online Dating Profiles" [the paper in the online proceedings is titled "Small, Strategic and Frequent: The True Extent and Nature of Misrepresentation in Online Dating Profiles"], detailing an experiment in which participants in an online dating service were interviewed and measured to determine how well their cyberspace profiles matched their meatspace selves. Given that [heterosexual] women tend to prefer tall, wealthy men and [heterosexual] men tend to prefer young, slender women, they expected to find discrepancies that matched these idealized profiles. They did find some evidence for the hypothesized deception strategy of frequent, subtle and strategically placed lies, claiming that men were more likely to lie about their height and socioeconomic status, and women were more likely to lie about their weight (but not their age) … although a questioner from the audience claimed that people tend to end each day about one inch shorter than they start each day, and suggested that the threshold (0.5) applied for lying about height may be too narrow.
Meg Cramer (Northwestern University), in her talk on Everything in Moderation: The Effects of Adult Moderators in Online Youth Communities, noted that parents, teachers and other authorities often express concerns about youth at risk (of at least being un[der]productive) in the ways in which they spend their time online, and conducted experiments with the Junior Summit online community to see how different levels of adult moderation (low, medium, high) affected participation in the community. She and her colleagues discovered that higher levels of adult moderation correlated with smaller numbers of generally more respectful messages. I suspect that adult moderation in a broader online community, such as MySpace, would have a similar effect … in part due to what I expect would be a mass exodus of the participants. Given Robert Epstein’s recent [controversial] book, The Case Against Adolescence, it would be interesting to investigate any differences in the effects of adult supervision on youth communities vs. other, less demographically-focused communities (which, given Marc Smith’s earlier talk, often include people who regularly engage in rather sophomoric behavior).
Moira Burke (CMU) investigated the influence of styles of expression in her presentation on Introductions and Requests: Rhetorical Strategies that Elicit Response in Online Communities. In a series of studies, she and her colleagues found evidence that the use of "I" statements and asking [very specifically] for what you want were more likely to elicit responses to introductions and questions than other styles of expression in online communities (confirming the wisdom of communication strategies modeled in [my experience of] twelve-step programs and the Mankind Project). She has been applying these findings in the development of an application that will offer feedback to a prospective poster on the likelihood that a message being composed will receive a response; I think such a tool for pre-screening email messages could provide a huge boost to productivity in the workplace.
Scott Golder (HP Labs) and his colleagues looked at Rhythms of Social Interaction: Messaging within a Massive Online Network, analyzing usage patterns among 4.2 million Facebook users from 500 schools who exchanged 284 million messages and 709 million pokes over a 26 month period. Given the transitory nature of the college experience, they investigated both friendship and communication patterns among the users, or more specifically – noting the small overlap between Facebook friends and friends one might invite to one’s wedding – Facebook friending patterns and messaging patterns. Friending patterns seem to be close to the magic Dunbar number of 150 (median: 144; mean: 179), although friend whoring leads to some extreme cases, e.g., 11 users have more than 10,000 friends. Messaging within Facebook was less prevalent than friending, with a mean of 77 messages per user among the sample group. Only 15% of friends exchange messages through Facebook, although 90% of the overall message traffic occurs between friends, and very little of this traffic occurs in the mornings. This research adds to other quantitative studies done by the researchers at MSU and elsewhere; I hope that we will see some complementary qualitative studies to help us better understand the intentions and experiences of [small samples of] Facebook users … er, preferably, more active users than this user. [Oops, this may be a near violation of my vow not to use "here" as an anchor text...]
Anatoliy Gruzd (UIUC) presented "A Noun Phrase Analysis Tool for Mining Online Community Conversations", which, unlike many open source NLP toolkits (NLTK, LingPipe, MII NLP Toolkit or OpenNLP), does not require much knowledge of computational linguistics or programming expertise. After motivating the importance of developing automated mechanisms to make sense of the exponentially increasing amount of digital information – 70% of which will be user-generated, and most of which will be text-based (according to an IDC report) – Anatoliy demonstrated how the Internet Community Text Analyzer tool can be used to annotate sample texts on the web in order to bootstrap a machine learning system that can then extract representative excerpts (noun phrases) from other web documents. It’s been a long time since I’ve dabbled in natural language processing, machine learning and information extraction; but even then, noun phrase extraction was one of the few areas in which semi-automated systems could achieve high performance. It will be interesting to see whether / how such methods can help users deal with user-generated information overload.
Karsten Wolf (U. Bremen) explored highly, er, engaged players of World of Warcraft in his talk with the strategically question marked title "Communities of Practice in MMORPGs: An Entry Point into Addiction?" World of Warcraft currently has 8 million subscribers, nearly 10% (!) of whom are, on average, online at any given time. Analysis of 1102 responses to a survey of German players (93% of whom were men) revealed a variety of goals and aspirations for playing WoW. Those who aspire to community tend to find it, often while playing fewer hours per week than the hard core gamers. The hard core gamers tended to aspire to knowledge and/or reputation; they also tend to find what they are looking for, but tend to play longer (and longer) and are [thus] more prone to develop symptoms of addiction (loss of control, withdrawal, [obsessive] mental focus, tolerance, negative consequences for work performance, negative consequences for social life) … they also have a lower tolerance for lurkers. He concluded that WoW appears to be designed to be played 40+ hours per week (I’m not sure whether he would claim it was designed for addiction (perhaps all games are)). Reflecting on some recent thoughts and discussions about passion and addiction, in which I started questioning whether passion is [always] a Good thing, I’m wondering whether addiction is always a Bad thing … and wondering how many Great things were accomplished without a level of engagement that might be viewed [by some] as an addiction.
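Going back to the subscriber figures Karsten cited, here is a rough back-of-the-envelope check (my own arithmetic, not something presented in the talk): if about 10% of subscribers are online at any given moment, then, averaged over the whole subscriber base, each subscriber is online for about 10% of the 168 hours in a week. The sketch below treats "nearly 10%" as exactly 10%, and the helper name implied_weekly_hours is made up for illustration.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours in a week


def implied_weekly_hours(fraction_online: float) -> float:
    """Average hours online per subscriber per week implied by the
    average fraction of subscribers who are online at any given time."""
    return fraction_online * HOURS_PER_WEEK


# Figures as reported in the talk; "nearly 10%" is rounded to 0.10 (assumption).
subscribers = 8_000_000
fraction_online = 0.10

concurrent_players = subscribers * fraction_online
avg_hours = implied_weekly_hours(fraction_online)

print(f"Players online at a typical moment: {concurrent_players:,.0f}")
print(f"Implied average play time: {avg_hours:.1f} hours per subscriber per week")
```

That is an average across all subscribers, casual and hard core alike, so it says nothing about how the hours are distributed; it only suggests that a 40+ hours per week player is well above the mean.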
The last event of the day was a panel on "Connected Lives: ICTs in Everyday Life" composed of Barry Wellman and some of his current and former students. It diverged considerably from my own conception of a panel, which typically includes a variety of people with divergent backgrounds and views. This group was looking at a variety of phenomena relating to online community, but there was not much divergence. I’ll just mention a bit about the people and the high order bits I took away. Bernie Hogan talked about Internet use differences between rural and urban populations (major difference is that rural folks use the Internet less for work); Anabel Quan-Haase explored the differences between local and distant social ties among students (I remember wondering about how Skype has affected this); Helen Wang presented some results from the World Internet Project sponsored by the Center for the Digital Future, concluding that using the Internet has virtually no negative effects on personal relationships (my wife may dispute that claim, given her husband’s addiction … er, devotion to blogging, her daughter’s devotion to MySpace and her son’s devotion to Runescape).
Saturday began with a presentation by Matthew Wong and Andrew Clement (University of Toronto) on Sharing Wireless Internet in Urban Neighbourhoods. Near the outset, the audience was asked who had ever opened up their laptop looking for an open wireless access point, hoping to connect to the Internet, and nearly everyone raised their hand. I was so eager to find out how many in the audience opened up their access points to others, I nearly shouted out the question myself, but I behaved (but still wonder how many hands would have stayed up). A short time later, I received partial satisfaction: a survey reported in the paper (and presentation) revealed that 65% of respondents who use other people’s signals (access points) without permission feel little guilt, whereas 55% of respondents feel at least a little angry about people using their signals without permission. I'm not sure exactly how to calculate the hypocrisy quotient in this data, but it would have been fun to find out how it compared to the hypocrisy quotient among the attendees of the conference. Fears over open WiFi are widespread, at least in the press, and I think it would be interesting to investigate regional differences (the survey was of Toronto residents, and I was reminded of Michael Moore's film, Bowling for Columbine, in which he discovered that a large proportion of residents in Toronto do not lock their front doors).
[Judith Donath gave such an inspiring keynote in the next session that I'm going to post a separate blog entry about it. [Update: I've posted my notes on Judith's talk about signals, truth & design.]]
Anita Blanchard (UNC, Charlotte) presented "Technology and Community Behavior in Online Environments: A Work in Progress", in which she and her colleague are studying the correlation between technical features of four different online communities devoted to parenting (BabyCenter, CharlotteMommies, Phantom Scribbler and AskDrSears) and the participant behaviors and outcomes observed in those communities. She offered some engaging examples of the ways that participants present themselves in those communities - via usernames and/or signatures - and how those presentations of selves change over time, e.g., "X’s mommy" (substitute children's names for X), imakemilkwhatsyoursuperpower (my favorite), humanoven (pregnant woman), newmommy, mothersuperior, queenmother. I don't remember if they had reached any conclusions about technical features and behaviors and outcomes, but I wonder whether / how causality will be determined, i.e., do the behaviors and outcomes emerge because of the technical features, or do people choose communities with certain technical features because they desire certain behaviors and outcomes? In any case, I'll look forward to future reports on this work.
The last parallel paper session represented a dilemma for me: one track, on Social Networks, Communities, and Technologies, was very closely related to my current research; the other track, on Communities, Technologies and Bridging Social and Economic Divides, was more closely related to my inexplicable but inexorably increasing interest in Africa ... and so [of course] I chose the latter.
Liezl Lambrecht Coetzee (University of Stellenbosch, South Africa) started the session with a spirited presentation on "World Wide Webs: Crossing the Digital Divide through Promotion of Public Access" [slides], in which she noted that the Web transcends territorial boundaries, but not economic and class boundaries: North America has 5% of the world's population, but 60% of the world's Internet users; Africa has 14% of the world's population, but only 3% of the world's Internet users (other gaps were also highlighted). Liezl reviewed the progress of the Cape Access Project, begun in July 2002, installing 36 computers in 6 public libraries, and expanded to all 98 libraries in 2006. The 100,000 users (restricted to 45 minutes / day quotas) use the computers to access information, create businesses, find jobs, communicate with relatives/friends, connect with global networks (especially women seeking out battered women's organizations) and provide public input through online surveys. After enumerating the 12 primary factors influencing real access (which involve political, economic and social dimensions at least as much as, if not more than, technical issues), she ended with highlights from the Smart Cape Story competition; one of the stories, entitled "Dreams are good things", included a segment where the storyteller reported "I can now give expression to so much of what is within me". This was a goosebump moment for me, bringing to mind (and heart) some earlier ruminations on unfolding radiance, and helped me begin to unravel the mysterious pull Africa has on me. A Nokia phone was offered as a prize to the winner of the competition, providing yet another inkling that Nokia is a natural benefactor - and beneficiary - of helping to develop the developing world.
Venkata Ratnadeep Suri (Indiana University) talked about "Lateral Connectivity in Development Projects: Correcting the Long-Distance Bias", questioning the conventional wisdom - a metropolitan or long distance bias - that the best use of connectivity for rural environments is to link them with urban areas, and arguing instead for a Prioritization of the Lateral, in which rural autonomy can be preserved by pooling lateral resources to create regional commonwealth. Ratnadeep presented three case studies offering examples of lateral connectivity: Nemonet, connecting 7 rural schools in Missouri; Chancay-Huaral Agricultural Information Network, connecting 17 water management boards in Chancay Huaral Valley of Peru (also mentioned in a BBC report and Howard Rheingold's article on Farmers, Phones and Markets: Mobile Technology In Rural Development) and the Tanami Network, a video satellite telecom network connecting four remote Warlpiri Aborigine provinces to support daily activities such as rituals, ceremonies and classes.
Kylie Peppler (UCLA) shared experiences with High Tech Programmers in Low-income Communities: Creating a Computer Culture in a Community Technology Center, highlighting the use and impacts of the new media-rich programming language and environment, Scratch (that, synchronistically, my colleague Pertti Huuskonen had shown me just before I left for the conference). Scratch was adding a new dimension to promoting the goal of technology fluency as expressed by the National Research Council - "the ability to reformulate knowledge, to express oneself creatively and appropriately, and to produce and generate information (rather than simply to comprehend it)" - in a computer clubhouse in South Los Angeles. One of the most interesting aspects of the presentation (for me) was the notion of "mentor as muse": 36 liberal arts majors / education minors (27 female, 9 male) were assigned as mentors to the [younger] students in the clubhouse, and adopted a strategy in which the roles of mentor and mentee shifted rather fluidly, with the clubhouse members mentoring the mentors, demonstrating the value of a listen and participate (vs. command and control) paradigm in learning (and life).
The conference concluded with yet another panel that didn't seem like a panel, offering a whirlwind tour of lots of projects on collecting and analyzing lots of data. Among the projects covered were the Internet Archive, SIDGrid, the National Science Foundation's Cyber-Enabled Discovery and Innovation (CDI) Initiative, the Structure of Population, Levels of Abundance, and Status of Humpbacks (SPLASH), the Genetic Association Information Network (GAIN), and perhaps others - I was already suffering from information (and inspiration) overload (and perhaps discontinuous partial inattention) at that point, and so my notes are rather sketchy. In any case, it looks like there will be plenty more data for socializing and sociologizing about at future Communities & Technologies conferences.
[John Kuner has also posted some interesting notes from the conference]