Posts Tagged ‘PHP’

Give Me My Data upgrade: New API, authorization, and data formats

Monday, July 4th, 2011

No one would be surprised to learn that almost all user-generated content websites use our personal data to sell advertisements. In fact, 97% of Google’s revenue comes from advertising.[1] That’s why it’s important that these sites provide as much access as possible to the real owners of our data: us. After all, we put it there and allow them to use it in exchange for the use of their software. Seems like a fair trade if you ask me.

A year and a half ago Facebook didn’t provide any access. That’s why I created Give Me My Data: to help users reclaim and reuse the personal data they put on Facebook.

By giving more agency to users of online systems, Give Me My Data may have already impacted the nature of online application development. In November 2010, almost a year after I launched Give Me My Data, Facebook created their own service for users to export their profile from Facebook as a series of HTML pages. Unlike Give Me My Data, the Facebook service doesn’t allow you to select which data you want or to choose custom formats to export. It also doesn’t give you options for visualization like the custom network graphs that Give Me My Data offers.

I believe their motivation originated in part with my application, likely because of its popularity, and that it points to the potential usefulness of similar apps. It may take years before many other online systems give users control over their data, but I see this as a positive effect, one where the content we create, as well as the means to share and manage it, is democratized.

Meanwhile, all of this keeps me hard at work developing the Give Me My Data project. This week I rewrote the program to use Facebook’s new OAuth authorization, which also required rewriting all of the code that fetches the data. Previously it used the REST API, which Facebook plans to deprecate at some still-unspecified point in the future. I also added new data types, fixed the CSV format (which had the rows and columns mixed up), and added the option to export in the JSON data format.
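To make the CSV fix concrete, here is a minimal sketch of exporting the same records as CSV (one header row, then one row per record) and as JSON. It is not the application's actual code: the `$friends` records, their field names, and the `to_csv()` helper are hypothetical and only stand in for whatever the Facebook API returns.

```php
<?php
// Hypothetical sample records; real field names come from the Facebook API.
$friends = array(
    array('uid' => 1, 'name' => 'Ada', 'locale' => 'en_GB'),
    array('uid' => 2, 'name' => 'Grace', 'locale' => 'en_US'),
);

// Build a CSV string: column names first, then one line per record.
function to_csv(array $records)
{
    if (count($records) === 0) {
        return '';
    }
    $out = fopen('php://memory', 'w+'); // write to memory instead of a file
    fputcsv($out, array_keys($records[0]), ',', '"', '\\');
    foreach ($records as $record) {
        fputcsv($out, $record, ',', '"', '\\');
    }
    rewind($out);
    $csv = stream_get_contents($out);
    fclose($out);
    return $csv;
}

print to_csv($friends);      // records as rows, fields as columns
print json_encode($friends); // the same records as a JSON array of objects
```

Using `fputcsv` rather than joining values by hand means commas and quotes inside a field are escaped correctly.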

Finally, in the data selector, I distinguished standard data types from customized ones. When I say customized, I mean that I’ve written code that mashes together more than one data table and/or addresses a specific question. For example, right now users can select from two types of network graphs and corresponding formats. One describes the user’s relationship to their friends; the other describes the user’s relationship to their friends as well as all their friends’ relationships to each other. Both can be exported in various graph description languages. This is how I made the network graph image below. I’m also interested in hearing other suggestions for custom queries I might add. The project will be open source on GitHub soon, so even code contributions will be welcome.
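As an illustration of the two graph types, here is a rough sketch that writes them in DOT, the graph description language read by Graphviz. The names, the sample data, and the `to_dot()` and `dot_id()` helpers are all hypothetical; the application's own export code may look quite different.

```php
<?php
// Hypothetical data: the user's friends, plus which friends know each other.
$friends = array('Ada', 'Grace', 'Linus');
$mutual  = array(array('Ada', 'Grace'), array('Grace', 'Linus'));

// Quote a name for DOT, escaping any double quotes inside it.
function dot_id($name)
{
    return '"' . str_replace('"', '\\"', $name) . '"';
}

// Build an undirected DOT graph. With only $friends this is the first
// graph type (user to friends); passing $mutual adds the edges between
// friends, which gives the second type.
function to_dot($me, array $friends, array $mutual = array())
{
    $lines = array('graph friends {');
    foreach ($friends as $friend) {
        $lines[] = '  ' . dot_id($me) . ' -- ' . dot_id($friend) . ';';
    }
    foreach ($mutual as $pair) {
        $lines[] = '  ' . dot_id($pair[0]) . ' -- ' . dot_id($pair[1]) . ';';
    }
    $lines[] = '}';
    return implode("\n", $lines) . "\n";
}

print to_dot('me', $friends);          // user-to-friends only
print to_dot('me', $friends, $mutual); // plus friend-to-friend edges
```

Saved to a file, either output can be rendered with one of the Graphviz layout programs.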

Anyway, please try out the new version. You may have to delete the app from your allowed applications and then re-authorize it if you’ve used it before. As usual, you can provide feedback on the application page, and you can also contact me on Twitter via @givememydata.

[1] “Google Financial Tables for Quarter ending June 30, 2009” Retrieved October 13, 2010

Freedom for Our Files: Code and Slides

Monday, May 16th, 2011

A two-day workshop, with both technical hands-on and idea-driven components. Learn to scrape data and reuse public and private information by writing custom code and using the Facebook API. Additionally, we’ll converse and conceptualize ideas to reclaim our data literally and also imagine what is possible with our data once it is ours!

Here are the slides and some of the code samples from the Freedom for Our Files (FFOF) workshop I just did in Linz at Art Meets Radical Openness (LiWoLi 2011).

The first one is a basic scraping demo that uses “find-replace” parsing to change specific words (I’m including examples below the code).

<?php

/* Basic scraping demo with "find-replace" parsing
* Owen Mundy Copyright 2011 GNU/GPL */

$url = "http://www.bbc.co.uk/news/"; // 0. url to start with

$contents = file_get_contents($url); // 1. get contents of page in a string

// 2. search and replace contents
$contents = str_replace( // str_replace(search, replace, string)
"News",
"<b style='background:yellow; color:#000; padding:2px'>LIES</b>",
$contents);

print $contents; // 3. print result

?>

Basic scraping demo with “foreach” parsing

<?php

/* Basic scraping demo with "foreach" parsing
* Owen Mundy Copyright 2011 GNU/GPL */
 
$url = "http://www.bbc.co.uk/news/"; // 0. url to start with

$lines = file($url); // 1. get contents of url in an array

$get_content = false; // are we between the opening and closing strings?
$data = ""; // collected content

foreach ($lines as $line_num => $line) // 2. loop through each line in page
{
    // 3. if opening string is found (strpos returns 0 for a match at the
    // start of the line, so compare against false explicitly)
    if (strpos($line, '<h2 class="top-story-header ">') !== false)
    {
        $get_content = true; // 4. we can start getting content
    }

    if ($get_content)
    {
        $data .= $line . "\n"; // 5. then store content until closing string appears
    }

    if (strpos($line, "</h2>") !== false) // 6. if closing HTML element found
    {
        $get_content = false; // 7. stop getting content
    }
}

print $data; // 8. print result

?>

Basic scraping demo with “regex” parsing

<?php

/* Basic scraping demo with "regex" parsing
* Owen Mundy Copyright 2011 GNU/GPL */
 
$url = "http://www.bbc.co.uk/news/"; // 0. url to start with

$contents = file_get_contents($url); // 1. get contents of url in a string

// 2. match title
preg_match('/<title>(.*)<\/title>/i', $contents, $title);

print $title[1]; // 3. print result

?>

Basic scraping demo with “foreach” and “regex” parsing

<?php

/* Basic scraping demo with "foreach" and "regex" parsing
* Owen Mundy Copyright 2011 GNU/GPL */

// url to start
$url = "http://www.bbc.co.uk/news/";

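// get contents of url in an array
$lines = file($url);

$get_content = false; // are we between the opening and closing strings?
$data = ""; // collected content

// look for the string
foreach ($lines as $line_num => $line)
{
    // find opening string (strpos returns 0 for a match at the start
    // of the line, so compare against false explicitly)
    if (strpos($line, '<h2 class="top-story-header ">') !== false)
    {
        $get_content = true;
    }

    // if opening string is found
    // then store content until closing string appears
    if ($get_content)
    {
        $data .= $line . "\n";
    }

    // closing string
    if (strpos($line, "</h2>") !== false)
    {
        $get_content = false;
    }
}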

// use regular expressions to extract only what we need...

// png, jpg, or gif inside a src="..." or src='...'
// (non-greedy, so the match stops at the first image extension)
$pattern = "/src=[\"']?([^\"'>]*?\.(png|jpg|gif))[\"']?/i";
preg_match_all($pattern, $data, $images);

// text from link
$pattern = "/(<a.*>)(\w.*)(<.*>)/ismU";
preg_match_all($pattern, $data, $text);

// link
$pattern = "/(href=[\"'])(.*?)([\"'])/i";
preg_match_all($pattern, $data, $link);

/*
// test if you like
print "<pre>";
print_r($images);
print_r($text);
print_r($link);
print "</pre>";
*/

?>

<html>
<head>
<style>
body { margin:0; }
.textblock { position:absolute; top:600px; left:0px; }
span { font:5.0em/1.0em Arial, Helvetica, sans-serif; line-height:normal;
background:url(trans.png); color:#fff; font-weight:bold; padding:5px }
a { text-decoration:none; color:#900 }
</style>
</head>
<body>
<img src="<?php print $images[1][0] ?>" height="100%">
<div class="textblock"><span><a href="<?php print "http://www.bbc.co.uk".$link[2][0] ?>"><?php print $text[2][0] ?></a></span><br>
</div>
</body>
</html>

And the example, which presents the same information in a new way…

Advanced scraping demo with “regex” parsing. Retrieves current weather in any city and colors the background accordingly. The math below for normalization could use some work.

<?php

/* Advanced scraping demo with "regex" parsing. Retrieves current
* weather in any city and colors the background accordingly.
* The math below for normalization could use some work.
* Owen Mundy Copyright 2011 GNU/GPL */

?>

<html>
<head>
<style>
body { margin:20px; font:1.0em/1.4em Arial, Helvetica, sans-serif; }
.text { font:10.0em/1.0em Arial, Helvetica, sans-serif; color:#000; font-weight:bold; }
.navlist { list-style:none; margin:0; position:absolute; top:20px; left:200px }
.navlist li { float:left; margin-right:10px; }
</style>
</head>

<body onLoad="document.f.q.focus();">

<form method="GET" action="<?php print htmlspecialchars($_SERVER['PHP_SELF']); ?>" name="f">

<input type="text" name="q" value="<?php print isset($_GET['q']) ? htmlspecialchars($_GET['q']) : '' ?>" />
<input type="submit" />

</form>

<ul class="navlist">
<li><a href="?q=anchorage+alaska">anchorage</a></li>
<li><a href="?q=toronto+canada">toronto</a></li>
<li><a href="?q=new+york+ny">nyc</a></li>
<li><a href="?q=london+uk">london</a></li>
<li><a href="?q=houston+texas">houston</a></li>
<li><a href="?q=linz+austria">linz</a></li>
<li><a href="?q=rome+italy">rome</a></li>
<li><a href="?q=cairo+egypt">cairo</a></li>
<li><a href="?q=new+delhi+india">new delhi</a></li>
<li><a href="?q=mars">mars</a></li>
</ul>

<?php

// make sure the form has been sent
if (isset($_GET['q']))
{
    // get contents of url in a string (urlencode makes the query safe to use in a url)
    if ($str = file_get_contents('http://www.google.com/search?q=weather+in+'
        . urlencode($_GET['q'])))
    {

        // use regular expressions to extract only what we need...

        // 1, 2, or 3 digits followed by any version of the degree symbol
        $pattern = "/[0-9]{1,3}(?:º|°)C/";
        $scale = "";
        // match the pattern with a C or with an F
        if (preg_match_all($pattern, $str, $data) > 0)
        {
            $scale = "C";
        }
        else
        {
            $pattern = "/[0-9]{1,3}(?:º|°)F/";
            if (preg_match_all($pattern, $str, $data) > 0)
            {
                $scale = "F";
            }
        }

        // remove html (empty string if nothing matched)
        $temp_str = isset($data[0][0]) ? strip_tags($data[0][0]) : "";
        // remove everything except numbers and points
        // (preg_replace, because ereg_replace is deprecated as of PHP 5.3)
        $temp = preg_replace("/[^0-9.]/", "", $temp_str);

        // compare with "" so that a temperature of 0 still counts
        if ($temp !== "")
        {

            // what is the scale?
            if ($scale == "C")
            {
                // convert ºC to ºF
                $tempc = $temp;
                $tempf = ($temp*1.8)+32;
            }
            else if ($scale == "F")
            {
                // convert ºF to ºC
                $tempc = ($temp-32)/1.8;
                $tempf = $temp;
            }
            // normalize the number, clamped to a valid index of $color_scale
            $color = (int) max(0, min(8, round($tempf/140,1)*10));
            // cool -> warm
            // scale -20 to: 120
            $color_scale = array(
                '0, 0,255',
                '0,128,255',
                '0,255,255',
                '0,255,128',
                '0,255,0',
                '128,255,0',
                '255,255,0',
                '255,128,0',
                '255, 0,0'
            );

?>

<style> body { background:rgb(<?php print $color_scale[$color] ?>) }</style>
<div class="text"><?php print round($tempc,1) ."&deg;C " ?></div>
<?php print round($tempf,1) ?>&deg;F

<?php

        }
        else
        {
            print "city not found";
        }
    }
}
?>

</body>
</html>
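Since the normalization math is flagged above as needing work, here is one possible alternative, offered as a sketch and not as the fix used in the workshop. It maps the -20 to 120 ºF range named in the code comment linearly onto the nine color indexes and clamps anything outside that range. The `color_index()` function and its parameter names are hypothetical.

```php
<?php
// Map a Fahrenheit temperature onto an index into a color scale.
// The -20..120 range comes from the comment in the demo; values outside
// it are clamped to the coldest or warmest color.
function color_index($tempf, $steps = 9, $min = -20, $max = 120)
{
    $ratio = ($tempf - $min) / ($max - $min); // 0.0 (cold) to 1.0 (warm)
    $ratio = max(0, min(1, $ratio));          // clamp out-of-range readings
    return (int) round($ratio * ($steps - 1));
}

print color_index(50); // middle of the range, so the middle color
```

With this mapping -20 ºF lands on the first color, 120 ºF on the last, and every result is a valid integer index into a nine-entry array.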




For an xpath tutorial check this page.
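For comparison with the regex demos above, here is a small sketch of the XPath approach using PHP's built-in DOMDocument and DOMXPath classes. The inline `$html` sample is made up so the example runs without a network request; it only imitates the kind of markup the demos look for.

```php
<?php
// Hypothetical sample markup, standing in for a fetched page.
$html = '<html><body>
<h2 class="top-story-header"><a href="/news/world-1">Top story</a></h2>
<h2 class="other"><a href="/news/uk-2">Other story</a></h2>
</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
$doc->loadHTML($html);
libxml_clear_errors();

// Select every link inside an h2 whose class contains "top-story-header".
$xpath = new DOMXPath($doc);
$links = $xpath->query('//h2[contains(@class, "top-story-header")]/a');

$stories = array();
foreach ($links as $a) {
    $stories[] = array(
        'text' => trim($a->textContent),
        'href' => $a->getAttribute('href'),
    );
}

print_r($stories);
```

Because the query works on the parsed document tree, it does not depend on how the markup is split across lines, which is where the line-by-line demos are most fragile.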

For the next part of the workshop we used Give Me My Data to export our information from Facebook in order to revisualize it with Nodebox 1.0, a Python IDE similar to Processing.org. Here’s an example:

Update: Some user images from the workshop. Thanks all who joined!

Mutual friends (using Give Me My Data and Graphviz) by Rob Canning

identi.ca network output (starting from my username (claude) with depth 5, rendered to svg with ‘sfdp’ from graphviz) by Claude Heiland-Allen

Freedom for Our Files: Creative Reuse of Personal Data Workshop at Art Meets Radical Openness in Linz, Austria

Tuesday, May 10th, 2011

This weekend I am presenting a lecture about Give Me My Data and conducting a two-day data-scraping workshop at Art Meets Radical Openness in Linz, Austria. Here are the details.

The Self-Indulgence of Closed Systems
May 13, 18:45 – 19:15
Part artist lecture, part historical context, Owen Mundy will discuss his Give Me My Data project within the contexts of the history of state surveillance apparatuses, digital media and dialogical art practices, and the ongoing contradiction of privacy and utility in new media.

Freedom for Our Files: Creative Reuse of Personal Data
May 13-14, 14:00 – 16:30
A two-day workshop, with both technical hands-on and idea-driven components. Learn to scrape data and reuse public and private information by writing custom code and using the Facebook API. Additionally, we’ll converse and conceptualize ideas to reclaim our data literally and also imagine what is possible with our data once it is ours! Register here


Art Meets Radical Openness (LiWoLi 2011),
Date: 12th – 14th May 2011
Location: Kunstuniversität Linz, Hauptplatz 8, 4020 Linz, Austria

Observing, comparing, reflecting, imitating, testing, combining

LiWoLi is an open lab and meeting spot for artists, developers and educators using and creating FLOSS (free/libre open source software) and Open Hardware in the artistic and cultural context. LiWoLi is all about sharing skills, code and knowledge within the public domain and discussing the challenges of open practice.

Keyword Intervention update

Sunday, October 24th, 2010

I launched Keyword Intervention in January 2007, and for almost four years now it has been scraping topical search terms and attracting random traffic. Today I moved the project to its own domain, keywordintervention.com, and also updated the documentation on the site. Below is a sample of the last 500 search terms by users all around the world. The full list is here.

Automata: Counter-Surveillance in Public Space paper on the Public Interventions panel at ISEA2010

Saturday, August 7th, 2010

ISEA2010 RUHR Conference in Dortmund, Germany

P26 Public Interventions
Tue 24 August 2010
15:00–16:30h
Volkshochschule Dortmund, S 137a
Moderated by Georg Dietzler (de)

  • 15:00h | Owen Mundy (us): Automata: Counter-Surveillance in Public Space
  • 15:20h | Christoph Brunner (ch/ca), Jonas Fritsch (dk): Balloons, Sweat and Technologies. Urban Interventions through Ephemeral Architectures
  • 15:40h | Georg Klein (de): Don’t Call It Art! On Artistic Strategies and Political Implications of Media Art in Public Space
  • 16:00h | Georg Dietzler (de): Radical Ecological Art and No Greenwash Exhibitions

About my talk:

Automata is the working title for a counter-surveillance internet bot that will record and display the mutually beneficial interrelationships between institutions for higher learning, the global defense industry, and world militaries. Give Me My Data is a Facebook application that helps users reclaim and reuse their Facebook data. The two projects, both ongoing, address important issues surrounding contemporary forms of communication, surveillance, and control.

New Automata sitemaps

Sunday, July 4th, 2010

A deconstruction of defense contractor website data structures.

[Image: ga-asi.com sitemap, December 8, 2009]

[Image: ga-asi.com sitemap, December 8, 2009 (detail)]

[Image: lockheedmartin.com sitemap, December 14, 2009]

[Image: lockheedmartin.com sitemap, December 14, 2009 (detail)]

New yourarthere.net website is live

Saturday, May 22nd, 2010

After four months of work, the new yourarthere.net website and member-run content management system is now live. Thanks to Braylin and Brittany Morales, Beth Lee, and Chris Cumbie for all their hard work.

The site is valid XHTML/CSS and runs on PHP/MySQL using the Codeigniter framework. All the details from our research from inception onward are archived here.

This site is based around the idea that members should have control of the content on the website. Every member has a profile where they can add images, text, tags, and events to promote their artwork or group. Members can create a new profile for every domain they host with yourarthere.net.

Hello World

Monday, May 17th, 2010

Considering sharing the source code behind Give Me My Data on GitHub. Looks like it’s great for archiving, improving, and sharing…

“Facebook’s Disconnect: Open Doors, Closed Exits” – TechCrunch

Sunday, May 9th, 2010

More press for Give Me My Data, this time by Rohit Khare from TechCrunch (thanks for the note, Evan).

“Give Me My Data has a more open-ended design that supports exploration and experimentation, in part because it sports an impressive array of formats to download your friend lists and other information for use in other projects such as visualization and charting. Owen Mundy at Florida State originally developed it for his own use, but “this week it kind of exploded because of the interface changes.” That could either be a sign of broader awareness of how much data users share with Facebook; or it could be the acute interest users have in putting profile data that Facebook “lost” right back onto Facebook (a feature that may be coming soon).”

“Two Facebook Apps To Help You Fight Back Against Facebook” – The Consumerist

Tuesday, May 4th, 2010

Two Facebook Apps To Help You Fight Back Against Facebook
by Chris Walters, The Consumerist, May 4, 2010