A True History of the Internet

yes it's true, all of it - the internet doesn't really exist, so it must be.

Thursday, April 18, 2013

quantity v. quality in social 'science' research + big data

My cousin Antony pointed me at the work of Tarde (and earlier, Leibniz) on the concept of monads

A paper by Latour (see
http://www.bruno-latour.fr/node/144
for background and google for the full paper/chapter)


so treating social nets as graphs lets us see aggregates and individuals as
properties of the set of edges and vertices - so that lets us unify this model -
provided we capture sufficiently rich types of edges (kinship relationships,
types of friendships, encounters, co-membership of clubs, geo-spatial relations,
psychological, etc etc)

it also might help explain the dichotomy in economics/history where most of the
time, most effects are caused by large group behaviour (a la marxist analysis)
but from time to time, individuals wield great influence and impact outcomes
(classical) - so this is just when someone is a hub at a time when opinions are
"hypercritical"? -- balanced between one extreme and another -- when that
person can sway a large number around them because of their centrality and
degree....
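
A toy sketch of the hub idea, assuming a made-up six-person graph with typed edges and a crude rule where each person leans the way most of their neighbours do. The names, edge types and the "local majority" rule are all illustrative assumptions, not anything taken from Tarde or Latour:

```python
from collections import defaultdict

# Hypothetical typed edges: (person, person, kind of relationship).
EDGES = [
    ("hub", "a", "kinship"),
    ("hub", "b", "friendship"),
    ("hub", "c", "club"),
    ("hub", "d", "encounter"),
    ("d", "e", "kinship"),
]

# Opinions are +1 or -1; three each way, so the population is balanced.
OPINIONS = {"hub": 1, "a": 1, "b": -1, "c": -1, "d": 1, "e": -1}


def build_graph(edges):
    """Undirected adjacency sets; the edge type is ignored in this toy."""
    graph = defaultdict(set)
    for u, v, _kind in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def degree_centrality(graph):
    """Fraction of the other people each person is directly tied to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def neighbours_swayed(graph, opinions, person):
    """Count neighbours whose local majority flips if `person` switches sides."""
    swayed = 0
    for nbr in graph[person]:
        before = sum(opinions[x] for x in graph[nbr])
        after = before - 2 * opinions[person]  # person's vote changes sign
        if (before > 0) != (after > 0):
            swayed += 1
    return swayed


graph = build_graph(EDGES)
centrality = degree_centrality(graph)
sway = {p: neighbours_swayed(graph, OPINIONS, p) for p in graph}
for person in sorted(graph, key=centrality.get, reverse=True):
    print(person, round(centrality[person], 2), sway[person])
```

In this particular balanced configuration the hub tips three neighbours while nobody else tips more than one - which is the "right butterfly" point in miniature, though a different rule or graph could easily behave differently.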

hmm... .. ..

fits with the whole peer-progressive thing too

so this is where small data (and anecdotes and narratives) meet big data

and it's also why the butterfly's wingflap causing a hurricane could be something we'd eventually model properly (after all, a trillion butterfly wingflaps happen every year without hurricanes, so it's a matter of modeling the right butterfly, or the right Genghis Khan).

I'm also pointed at Sandra Gonzalez Bailon's paper on this:
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2238198


I also like Kate Crawford's very nice talk on this topic ....

http://www.youtube.com/watch?v=irP5RCdpilc




Monday, March 18, 2013

from napster to friendster - it's all still piracy - according to Jaron Lanier

Just reading Jaron Lanier's new tome, "Who Owns the Future", which is, unsurprisingly, pretty good - one nicely put argument is about money as information (and its transformation from a record of past work into a model of future promises) - but the more striking point for me, especially in view of recent arguments about privacy and micropayment systems for cloud (OSN) services instead of eyeball time and surveillance analytics, is this: when you download some music for free, your appropriation deprives the author and performer of potential future revenue, but when the OSN company decides it can monetize all your pictures, life story, and interests, this is no different in reality. Yet if you do a lot of music/game/movie piracy, you will get in legal trouble, whereas when wholesale invasion of privacy and monetizing of your personal informational property occurs, the big corporate pirates are rewarded by Wall Street investment.....

note.. there is now some doubt being cast on what is being charged by these guys for your life - so this is interesting, as it calls into question the price we'd have to pay for a privacy preserving service (i.e. it isn't the revenue google and fb are currently making divided by the number of users, because their revenue reflects possibly absurd profit margins, which are completely unnecessary once we dispense with adverts and analytics - let's say it could be as little as 1/10th of their current revenue - that'd be peanuts)
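
To make the arithmetic concrete, here is a back-of-envelope sketch. The revenue and user numbers are deliberately round, made-up placeholders (not real figures for any company), and `cost_fraction` is just my 1/10th guess turned into a parameter:

```python
def subscription_per_user(annual_revenue, users, cost_fraction=0.1):
    """Annual subscription per user if only `cost_fraction` of today's
    ad-funded revenue actually has to be recovered."""
    if users <= 0:
        raise ValueError("users must be positive")
    return annual_revenue * cost_fraction / users


# Hypothetical round numbers, NOT real company figures.
revenue = 50_000_000_000  # currency units per year
users = 1_000_000_000

naive = subscription_per_user(revenue, users, cost_fraction=1.0)
lean = subscription_per_user(revenue, users)
print(f"naive: {naive:.2f} per user per year, lean: {lean:.2f} per user per year")
```

With these placeholder inputs the naive "revenue divided by users" price is ten times the lean one, which is the whole point: the interesting unknown is the cost fraction, not the division.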


reference:-
paper estimating worthlessness of paying for boosting search rank result
plus press coverage



Wednesday, March 06, 2013

Robot Ethics

so there's a lot of guff about in recent techblogs droning on about robots (drones) and ethics

here's a very simple thought experiment which doesn't need Terminator/Skynet to present a dilemma Real Soon Now

Cars are being fitted with devices that detect if they are heading for an obstacle and activate the brakes automagically to safely stop.....

However, if not all cars have such a tech, then the car behind might rear-end you

the front and rear impacts represent different risks (the crumple zones in a car are designed more to absorb impact in front than in the rear)

so if you detect an obstacle ahead and a car behind, and (assuming cooperation) you know whether the car behind has a robot safety braker or not...do you choose to brake less so that you amortize the impact over two cars?

so do you want a robot to act in the interests of ALL passengers in all vehicles, or "selfishly" on behalf of "this car only"???

I.e. if we design robot brakers according to Asimov's laws, do we want the 4th law as well as the usual first three?
[see the 4th law for ethical synthetic humanism, #101]
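
Here is a deliberately crude sketch of the braking dilemma. Everything in it is an assumption for illustration: braking effort `x` runs from 0 to 1, impact harm grows with the square of a normalised impact speed, and the weights just encode "a rear impact is worse for my occupants than a front one". It is not vehicle physics:

```python
def own_harm(x, w_front=1.0, w_rear=2.0):
    """Harm to this car's occupants at braking effort x in [0, 1]."""
    front = (1.0 - x) ** 2  # softer braking -> harder hit on the obstacle
    rear = x ** 2           # harder braking -> harder hit from behind
    return w_front * front + w_rear * rear


def total_harm(x, w_front=1.0, w_rear=2.0, w_other_front=1.0):
    """Own harm plus the front-impact harm to the car behind."""
    return own_harm(x, w_front, w_rear) + w_other_front * x ** 2


def best_braking(harm, steps=100):
    """Grid search for the braking effort that minimises `harm`."""
    levels = [i / steps for i in range(steps + 1)]
    return min(levels, key=harm)


selfish = best_braking(own_harm)
cooperative = best_braking(total_harm)
print(f"selfish braking effort: {selfish:.2f}")
print(f"cooperative braking effort: {cooperative:.2f}")
```

Under these made-up weights the robot that counts everybody's harm brakes less than the selfish one, so the two design goals really do pull in different directions, even in a model this simple.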

Tuesday, February 19, 2013

CACM and the ACM Digital Library (&Open Access)

The February issue of the CACM contains a letter about proposals and trending ideas for ACM's digital library access in the current context where many academic/government research funding agencies are increasingly seeking free and open access to taxpayer-funded work - the article skates over two possible aspects - see Open Access position statement...

1. Some funding agencies will not pay for "gold" (author pays) - this is excluded if an institute has its own repository (e.g. Cambridge University has a copyright library and DSpace and other systems, and has been around a bit longer than the ACM).....
2. Most of the DL content is authored, reviewed, edited, and even copies are hosted by ACM members - a very simple system would be to cover the costs of the DL completely from membership and conference/journal subscription income, but still allow 100% open access (open links and copies) - there's no evidence that systems that do this (Usenix, ArXiv, Wikipedia) fail. It behoves the ACM Publications Board to a) let us know what this would cost in terms of increased ACM and SIG membership annual fees and b) to estimate the impact (negative or possibly positive) on membership numbers....

Some SIGs (like SIGCOMM, since I was chair) have published fully open access to all the conference papers for a long time with no apparent negative impact on the SIG's membership or attendance at events (possibly positive, although that would be anecdotal only)

I think it would be interesting to canvass the entire ACM membership on this - there's a strong collegiate tradition in the ACM, and the idea of a "wholly membership owned not-for-profit free (green) open access digital library" of the main repository of 75 years of Computer Science would probably be very attractive to many people.

Looking at the rest of the issue, you can see many articles and papers of current interest that show what a lively and relevant organisation the ACM is and how timely much of the CACM material is - for example, earlier this week in Cambridge we had talks about dark silicon (and what ARM is doing about it), and lo and behold, there it is in the Research Highlights section, in the clearest and most accessible possible terms...and should be available to anyone in the world concerned with the future of our subject -

power, parallelism or reliability?

Monday, February 18, 2013

Mechanism Design, Incentives, and N-Sided Markets for N>2


I'd like to say that the Internet was born free, but everywhere is in
value-chains, but that would be naive and simplistic.

The purpose of this note is to elicit discussion and clarity on how we could
re-design existing systems in the Internet that use advertising to cover costs
(and make a profit) and replace the income by some other system. For example,
one could consider a system where people pay subscriptions (or even pay-per-use) to an
online service such as search, e-mail or social media. A simple economic
analysis would say that such a system could replace a 2-sided market (with the
service provider facing the customer for free, but advertisers for revenue -
i.e. a 3-body system) with a single, simple market, where transparency,
competition and market efficiencies would find the right price.
Such a system would also not need to exploit side effects, such as
monetising users' personal data, to be viable and sustainable.

Such a view is naive in the extreme, and I have two reasons for thinking so.

The first is micro-economic, and is about the value-chain of components in the
system. The second is macro-economic, and is more theoretical (and I'd like to
hear back from experts if it is technically correct).

Let's look at the first problem, which concerns the primary business of some of
the organisations involved in this complex world:

1. Let's think about a daily activity of a couple of billion of the world's
smartphone-carrying netizens, and see what stakeholders (at least in part) are involved.

Goods/Services -> Advertisers -> search/OSN -> ISP -> Cellular provider -> Handset -> Handset OS -> App/App SDK
                 <- analytics
                                     v
                                 cloud infrastructure

Obviously this is a hugely simplified picture. When you use an app on your
iDroid, (e.g. the HappyFrogs game), that you downloaded from an AppShop for
free, it presents you with adverts. These adverts come from the cloud and
provide revenue for the app implementer and the cloud provider. You probably
have a nearly-all-you-think-you-can-eat data plan with your cellular provider.
You probably also have a DSL line and WiFi at home - you might pay 30 zinglots
a month for the former and 10 for the latter. Occasionally (e.g. once per
month), you get billed for going over your mobile data plan. Once in a while
(e.g. once per year) you forget that while abroad, you might have to pay 50
zinglots to see a few adverts.

This is allegedly a 2-sided market. Obviously it isn't that simple since there
are payments for components in the system, and the people being paid are
incented to let you send/receive more data (to keep up with the latest speeds
and so on). Hence as much as 30% of traffic to your phone may be "unwanted"
adverts. And a lot of traffic from your phone is fed into analytics and sold as
market research.

Proponents of a subscription system for the services (and payment for games,
either buying or renting) claim that this misguided incentive would go away and
the system would settle into some Utopian Adams/Hayek-friendly perfect market.

This is pretty naive. The cost of the cellular system and its profits were
predicated on voice and SMS. (I'd note that when one of the largest companies
acquired its license for 3G spectrum, it paid 2-3% of the GDP of a large
European nation for it. It then wrote that down the next year. Spectrum
licenses of a decade or more are just another wrinkle in the picture).

So what you pay for your cell phone is a little high but not too much
(regulators in Europe have market tested the 3G providers and they are
competing - prices are decreasing, performance increasing - hey, they are even
competitive with fixed line broadband - there's some headroom in the backhaul
networks in some countries too, as the step-up increases in capacity sometimes
lead the increases in demand, although sometimes they lag too).
Another important facet of cellular is that, like the fixed line phone
companies before, the cell-phone net for voice is subject to lawful intercept
laws. The counter side is that people expect jolly-good-privacy. Cell phone
providers have not "failed" to deploy location based services - they have
actively avoided any service that might make them appear to their customers as
being Big Brother. Of course, this laid them wide open to handset OS and App
vendors doing all sorts of cool location based services, but being in the
pocket of the user, the perception of Apple/Googledroid/Microbird's AGPS based checkin systems was as a friendly personal thing, not a cold brazen impersonal
corporate spy. So note that _privacy_ was a part of the value of the cellular
phone service providers' business case.

The point here is that the cell phone business pre-dates and has businesses
which are separate from the 2-sided market of advert/game/user.

So the second outfit, the cloud service provider, also has other businesses.
Let's just think of search. Once upon a time there was Google search. It beat
off AltaVista by being better at stopping sites "gaming" the search ranking
algorithm. To do this, instead of PageRank, a whole slew of complex heuristics
had to be developed. In some sense, what Google (and any other eyeball catching
company) had to do was to predict what users really want. To do this, it has to
look at users AND, in great detail, at what they really click on, as opposed to
the order in which things show up in their search results. Bingo - click-on is
the true value. So now we have click-thru revenue from adverts for things and
auctions for positions of those things and rank order. A market in search, and
loss of data privacy.

So note that loss of privacy was also an essential side effect of a cloud
service provider's business case.

So in all this, the users' personal data, the value of any and all the little
people's identity is a tiny tiny piece. Indeed, the additional price to be
extracted from users if we were to switch them to subscription
is a minuscule step up from what is currently being paid every month by them
(both phone/home side and big data/cloud side customers). It ain't worth
getting out of bed for. It doesn't matter how small you make the transaction
overheads either. Clearly they can be made zero given you already have user
accounts and payments for going over data plan volume caps etc etc...

2. N-Sided markets, for N>2

Combinatorial Auctions, Collusion, and Confusion

My second argument is that we don't know how to do mechanism design
for a system of more than 3 customers. I claim this because I've read some of
the literature on combinatorial auctions, and see that the problem is
computationally hard. This means that all sorts of heuristics/approximations
and even machine learning have to be used to try and solve problems. However,
the key difficulty is that we don't have any algorithms that have explanatory
value. We don't know what the outcome will be and when we get it, we don't know
why we have it. This is no use for a regulator - this has already stalled 4G
spectrum auctions in some countries where the set of potential licensees and the
government couldn't figure out whether a particular structure of auction would
lead to any companies buying (e.g. this bit of spectrum in these states, that
bit in those states) going bankrupt, or getting a white elephant or what have
you. This matters, as businesses like to plan.
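
To give a feel for why this is computationally hard, here is a brute-force sketch of winner determination in a tiny combinatorial auction. The bidders, lots and prices are hypothetical; the point is only that the naive search looks at every subset of bids, which doubles with each extra bid:

```python
from itertools import combinations

# Hypothetical bids: (bidder, bundle of spectrum lots, price offered).
BIDS = [
    ("A", frozenset({"north", "south"}), 10),
    ("B", frozenset({"north"}), 6),
    ("C", frozenset({"south"}), 5),
    ("D", frozenset({"south", "east"}), 8),
    ("E", frozenset({"east"}), 4),
]


def best_allocation(bids):
    """Try every subset of bids; keep the most valuable one in which
    no lot is sold twice. Cost grows as 2**len(bids)."""
    best_value, best_set = 0, ()
    for k in range(1, len(bids) + 1):
        for subset in combinations(bids, k):
            lots = [lot for _, bundle, _ in subset for lot in bundle]
            if len(lots) != len(set(lots)):
                continue  # two bids claim the same lot
            value = sum(price for _, _, price in subset)
            if value > best_value:
                best_value, best_set = value, subset
    return best_value, [bidder for bidder, _, _ in best_set]


value, winners = best_allocation(BIDS)
print(value, winners)
```

Five bids means 31 subsets to check; fifty bids would mean about 10^15. Real auctions use cleverer search and approximations, which is exactly where the explanatory value tends to get lost.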

I claim that even without the value chain flow of personal data from the user
edge into the analytics cloud, the system is a many-sided market and we cannot
propose any mechanism that would incrementally line up all the pieces, 
unless we have a flag day and say we declare all the systems to have to operate
transparently from a given instant, and how likely is that to happen?

We could hope for some simplifying step - for example, if we were to force some
of the components in the network to combine (e.g. cellular+ISP+cloud), then
there would be a 2-sided system which could be collapsed down easily through
incentive alignment (e.g. a simple Vickrey auction of service between customer
and advertiser would determine price of privacy:).
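
For reference, a Vickrey (second-price, sealed-bid) auction is about as simple as mechanisms get. A minimal sketch, with hypothetical advertisers bidding for one slot of a user's attention:

```python
def vickrey_auction(bids):
    """bids: dict of bidder -> sealed bid.
    The highest bidder wins but pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


# Hypothetical sealed bids, in zinglots.
winner, price = vickrey_auction({"ad_x": 3.0, "ad_y": 2.5, "ad_z": 1.0})
print(winner, price)
```

Because the winner's payment does not depend on their own bid, bidding your true value is the sensible strategy, which is what makes the outcome easy to explain. That property is what gets lost once the market has many sides.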


There are many cloud providers and many ISPs and Cellular providers. However,
some of them are already implicated - for example, Google and Apple both build
handset OSs, run online services, control AppShop developers, and to some
extent, dictate terms to cellular providers (at least implicitly). Perhaps a
gentle nudge could get us the rest of the way there....surreally, this might
be a Good Thing.

3. Unintended consequences.

There would be many. I leave it as an exercise for you, dear reader, to think
of as many as you like.


Saturday, February 16, 2013

Who Are We - Who am i++

So I've updated my reflections on both Eric Schmidt's visit (see the Humanitas video of Google Chief's talks in Cambridge early 2013) plus my take on some of the discussions at the Dagstuhl Seminar on Decentralised Privacy and OSNs (see Dagstuhl's excellent Website for more info there) and here's the document that resulted....

My cousin Antony (a trained Anthropologist and Computer Scientist!) pointed me at this interesting revival of Tarde's use of the Monad notion, which seems somewhat relevant!

Thursday, January 31, 2013

who am i

draft for later today....

Who am i ( & why do we care?)?

Who am i - a person - a behaviour
        Jackie Chan, philosopher, martial artist, buffoon
        character - despite no id, behaves morally
        a SET of relationships

The rise of the robots
        Golem & rabbi of prague
                on/off meth/emeth, peace/killer
        Frankenstein's monster - no soul, but moral
        Asimov++ laws of robotics/robish - programming errors...and AIs

Weapons and software - provenance and liability
        semtex (those bouncing czechs again) - watermark
                s/w - originators & users - can also watermark/fingerprint
        botnets for hire
                follow the money (click trajectory) - ok for crims
                mad people, however, -- follow god -- no money :) ?

Who cares?

Prurient public interest in celebs - minor celebs - ...
        who really was marilyn monroe...etc etc
        interrogators/torture/brainwashing
        porn

Commercial interest in little ole me?
        adverts/recommendations - targeted
        click through (click fraud - bots again:)
        analytics == market research

Government and big data...
        evidence based health, energy and other policies
        finding the bad guys (cliques in social graph)
        panopticon - mission creep

Is the past 12 years "typical" or just a brief mad interlude ?
        do we want to base a future on a small window
        think 1780 "buy canals, young boy"... :-)

Social
        Dunbar - not just a number (150:)
        Social Nets project::- (Oxford (parent of) & Cambridge:-)
        also a law - 3^6 - circles of trust
                theory of mind -> endorphins ->shared experience-> trust
                family, close friends, acquaintances
                gossip
                forgetting
        Autism spectrum - are cloud companies just
                high functioning aspergers/stalkers

Tech change
        differential privacy
        homomorphic crypto
        privacy preserving
                search,
                advertising,
                analytics
        offers poss. of users "giving" MORE data:)
        tbd
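
As one concrete instance of the tech listed above, here is a minimal sketch of differential privacy for a counting query using Laplace noise. The records, the predicate and the choice of epsilon are hypothetical; the noise is drawn as the difference of two exponential samples, which gives a Laplace distribution:

```python
import random


def private_count(records, predicate, epsilon, rng):
    """Noisy count of matching records. A counting query changes by at
    most 1 when one person is added or removed (sensitivity 1), so the
    Laplace noise scale is 1 / epsilon."""
    true_count = sum(1 for record in records if predicate(record))
    rate = epsilon  # exponential rate = 1 / scale
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


# Hypothetical records: (person, bought_item).
RECORDS = [("p1", True), ("p2", False), ("p3", True), ("p4", True)]

rng = random.Random(0)
answer = private_count(RECORDS, lambda r: r[1], epsilon=1.0, rng=rng)
print(f"noisy count: {answer:.2f} (true count is 3)")
```

A smaller epsilon means more noise and more privacy. The analyst still gets a usable aggregate, which is why this sort of thing might let users hand over more data, not less.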

Policy change
        Privacy Law - Make It So!
        Only hold data that is pertinent, for so long as relevant

        Go further - don't hold data at all.
        I "hold" my data.
        I give you a capability to ask me for my data
        for so long as I allow...

        Audit trail tells me who looked at it when.

        Now no need for one single identity
        (which is an illusion anyhow)

        Me jon(a(than) = work, friends, close family
        My kids - two last names = parents, nationality

        Future - same as past (but not present) -

        Exploit unique UK position
        -- 1 id per relationship
                bank, tescos, amazon, doctor, school work, friends, family
        with associated keys to data
        -- Data owned by me (replicated encrypted in a million clouds)
        -- no aggregation allowed by others (only me:)
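
A toy sketch of the "I hold my data, you get a capability, I get an audit trail" idea from the outline above. The class, its method names and the example field are all hypothetical, and there is no encryption or replication here, just the access pattern:

```python
import secrets
import time


class PersonalDataStore:
    """Owner-held store: readers need an unexpired capability token."""

    def __init__(self):
        self._data = {}
        self._capabilities = {}  # token -> (grantee, field, expiry time)
        self.audit_trail = []    # (timestamp, grantee, field, allowed)

    def put(self, field, value):
        self._data[field] = value

    def grant(self, grantee, field, ttl_seconds):
        """Give one grantee access to one field for a limited time."""
        token = secrets.token_hex(16)
        self._capabilities[token] = (grantee, field, time.time() + ttl_seconds)
        return token

    def revoke(self, token):
        self._capabilities.pop(token, None)

    def read(self, token, now=None):
        """Return the field if the capability is valid; log every attempt."""
        now = time.time() if now is None else now
        cap = self._capabilities.get(token)
        if cap is None:
            self.audit_trail.append((now, "unknown", None, False))
            raise PermissionError("no such capability")
        grantee, field, expiry = cap
        allowed = now <= expiry
        self.audit_trail.append((now, grantee, field, allowed))
        if not allowed:
            raise PermissionError("capability expired")
        return self._data[field]


store = PersonalDataStore()
store.put("postcode", "CB3")
token = store.grant("tescos", "postcode", ttl_seconds=60)
print(store.read(token))       # allowed while the capability is live
store.revoke(token)            # "for so long as I allow..."
print(len(store.audit_trail))  # every read attempt is on record
```

One token per relationship gives the "1 id per relationship" flavour: the bank, the shop and the doctor each hold a different key, and nothing lets them join their views together.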

Consequence of tech + law:-
        Allows +me+ to monetize my person

        Tell how much value my store loyalty card is worth

        Provenance - digital footprint/breadcrumbs
        can track s/w
        and robots (or more importantly their programmers or priests)
        and AIs too

Who pays
        I do - because it's peanuts--

        Total facebook or google revenue/number of users

        Subscription instead of panopticon
        music high quality content already heading that way

        Note celebs (who am i) aren't on facebook...

        Do you really want to be low quality, marginal profit, product?

About Me

misery me, there is a floccipaucinihilipilification (*) of chronsynclastic infundibuli in these parts and I must therefore refer you to frank zappa instead, and go home