Thursday, April 18, 2013

quantity v. quality in social 'science' research + big data

My cousin Antony pointed me at the work of Tarde (and earlier, Leibniz) on the concept of monads.

A paper by Latour (see
for background and google for the full paper/chapter)

so if we treat social nets as graphs, we can see both aggregates and individuals
as properties of the set of edges and vertices - which lets us unify the two
views - provided we capture sufficiently rich types of edges (kinship
relationships, types of friendships, encounters, co-membership of clubs,
geo-spatial relations, psychological ties, etc etc)
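
a rough sketch of what that might look like in code (python, standard library only). the people, the edge types and the function names are all made up for illustration; the only point is that the "individual" and the "aggregate" are two readings of the same typed edge list.

```python
from collections import Counter, defaultdict

# Each edge is (person_a, person_b, edge_type).
# Names and edge types are invented for this example.
edges = [
    ("ann", "bob", "kinship"),
    ("ann", "cat", "friendship"),
    ("bob", "cat", "club"),
    ("cat", "dan", "club"),
    ("ann", "dan", "encounter"),
    ("ann", "bob", "club"),
]

def individual_view(edges, person):
    """Everything the edge set says about one person: ties grouped by type."""
    ties = defaultdict(set)
    for a, b, kind in edges:
        if a == person:
            ties[kind].add(b)
        elif b == person:
            ties[kind].add(a)
    return dict(ties)

def aggregate_view(edges):
    """The same edge set summarised over everyone: number of ties per type."""
    return Counter(kind for _, _, kind in edges)

def group(edges, kind):
    """An aggregate read off the edges: everyone joined by one type of tie."""
    return {p for a, b, k in edges if k == kind for p in (a, b)}

print(individual_view(edges, "ann"))
print(aggregate_view(edges))
print(group(edges, "club"))
```

the richer the edge types, the more of the "individual" you recover from the first function without ever storing a separate per-person record.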

it also might help explain the dichotomy in economics/history where most of the
time, most effects are caused by large group behaviour (a la marxist analysis)
but from time to time, individuals wield great influence and impact outcomes
(classical) - so this is just when someone is a hub at a time when opinions are
"hypercritical"? -- balanced between one extreme and another -- when that
person can sway a large number around them because of their centrality and

hmm... .. ..
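
a toy way to poke at that idea (python, standard library only). this is a sketch under loud assumptions, not a real opinion-dynamics model: the update rule, the `weight` of 0.3, the star-shaped network and the lean values are all invented. one person is pinned "for"; everyone else repeatedly takes the sign of their own lean plus the weighted stances of their neighbours.

```python
from collections import defaultdict

def swayed(edges, lean, pinned, weight=0.3, rounds=10):
    """Count how many others end up agreeing with the pinned person.

    lean[p] is p's private leaning in [-1, 1] (negative = against).
    The pinned person is held at +1; the others are updated synchronously.
    """
    nbrs = defaultdict(set)
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    stance = {p: (1 if v > 0 else -1) for p, v in lean.items()}
    stance[pinned] = 1
    for _ in range(rounds):
        new = dict(stance)
        for p in lean:
            if p == pinned:
                continue
            total = lean[p] + weight * sum(stance[q] for q in nbrs[p])
            new[p] = 1 if total > 0 else -1
        if new == stance:  # nothing changed, so stop early
            break
        stance = new
    return sum(1 for p, s in stance.items() if p != pinned and s == 1)

# A star: one hub tied to six followers (hypothetical network).
followers = ["a", "b", "c", "d", "e", "f"]
star = [("hub", f) for f in followers]
balanced = {p: -0.1 for p in ["hub"] + followers}    # nearly undecided
entrenched = {p: -0.8 for p in ["hub"] + followers}  # firmly against

print(swayed(star, balanced, "hub"))    # hub speaks, opinions balanced: 6
print(swayed(star, entrenched, "hub"))  # hub speaks, opinions entrenched: 0
print(swayed(star, balanced, "a"))      # peripheral person speaks: 0
```

in this toy, the individual only matters when both conditions hold at once: central position and a nearly balanced crowd. the rest of the time the group lean wins.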

fits with the whole peer-progressive thing too

so this is where small data (and anecdotes and narratives) meet big data

and it's also why the butterfly's wingflap causing a hurricane could be something we'd eventually model properly (after all, a trillion butterfly wingflaps happen every year without hurricanes, so it's a matter of modeling the right butterfly, or the right Genghis Khan).

I was also pointed at Sandra Gonzalez Bailon's paper on this:

I also like Kate Crawford's very nice talk on this topic ....

1 comment:

Jonathan Cave said...

Brief and predictable remarks:
1. Need to use nuanced or generalised graphs (n-ary rather than binary 'edges' with strength, direction, duration and state-contingency).
2. Social graphs are layered - networks of people, personalities, shared identities, etc. so quantities at one layer perceived in another may change the natural quantification of data.
3. Social systems are complex, so aggregation is not just adding up. The 'paradox' is quite natural here; in complex systems aggregation does not reduce individual data to insignificance.
4. Strong emergence (in the persistent mutual information sense ( or reminds us of the subjectivity
5. Data are what we choose to record, so endogeneity is a problem. This matters differently if we approach data as statisticians (where good models do what data are observed to do) or as social scientists (where data are used to test hypotheses coming from models).
