How to install Scrapy with MacPorts (full version)
Here is a step-by-step guide explaining how I got Scrapy running on my MacBook Pro (Mac OS X 10.5) using MacPorts to install Python and all required libraries (libxml2, libxslt, etc.). The following has been tested on two separate machines with Scrapy 0.10.
Many thanks to users here who shared some helpful amendments to the default installation guide. My original intention was to post this at Stack Overflow, but their guidelines discourage posting issues that have already been answered, so here it is…
1. Install Xcode with options for command line development (a.k.a. “Unix Development”). This requires a free registration.
2. Install MacPorts
3. Confirm and update MacPorts
$ sudo port -v selfupdate
4. “Add the following to /opt/local/etc/macports/variants.conf to prevent downloading the entire unix library with the next commands”
+bash_completion +quartz +ssl +no_x11 +no_neon +no_tkinter +universal +libyaml -scientific
5. Install Python
$ sudo port install python26
If for any reason you forgot to add the above variants, cancel the install and run a “clean” to delete all the intermediate files MacPorts created. Then edit the variants.conf file (above) and install Python again.
$ sudo port clean python26
6. Change the reference to the new Python installation
If you type the following, you will see the path to the default installation of Python on Mac OS X 10.5 (Python 2.5).
$ which python
You should see this
/usr/bin/python
To change this reference to the MacPorts installation, first install python_select
$ sudo port install python_select
Then use python_select to change the $ python reference to the Python version installed above.
$ sudo python_select python26
UPDATE 2011-12-07: python_select has been replaced by port select so…
To see the possible pythons run
port select --list python
From that list choose the one you want and change to it e.g.
sudo port select --set python python26
Now if you type
$ which python
You should see
/opt/local/bin/python
which is a symlink to
/opt/local/bin/python2.6
Typing the below will now launch the Python 2.6 interactive shell (Ctrl+D to exit)
$ python
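If you want to confirm the switch from inside Python rather than with which, here is a minimal sketch. It assumes MacPorts lives under its default /opt/local prefix, as in this guide; is_under_prefix is a hypothetical helper name used only for illustration.

```python
import os
import sys

# Default MacPorts prefix, as used throughout this guide (an assumption).
MACPORTS_PREFIX = "/opt/local"


def is_under_prefix(path, prefix=MACPORTS_PREFIX):
    """Return True if `path` sits inside `prefix` once symlinks are resolved."""
    real_path = os.path.realpath(path)
    real_prefix = os.path.realpath(prefix)
    return real_path.startswith(real_prefix.rstrip(os.sep) + os.sep)


if __name__ == "__main__":
    print("Interpreter: %s" % sys.executable)
    print("Version: %d.%d" % sys.version_info[:2])
    if is_under_prefix(sys.executable):
        print("This looks like the MacPorts Python.")
    else:
        print("Not under %s; check port select and your PATH." % MACPORTS_PREFIX)
```

Save it to a file and run it with the same python command you just tested, so that it reports on the interpreter your shell actually picks.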
7. Install required libraries for Scrapy
$ sudo port install py26-libxml2 py26-twisted py26-openssl py26-simplejson
Other posts recommended installing py26-setuptools, but it kept returning errors, so I skipped it.
8. Test that the correct architectures are present:
$ file `which python`
Note that those are backticks, not single quotes. On Intel Macs running 10.5, this should print:
/opt/local/bin/python: Mach-O universal binary with 2 architectures
/opt/local/bin/python (for architecture i386): Mach-O executable i386
/opt/local/bin/python (for architecture ppc7400): Mach-O executable ppc
9. Confirm libxml2 library is installed (those really are single quotes). If there are no errors it imported successfully.
$ python -c 'import libxml2'
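The same import test can be extended to the other libraries from step 7. This is only a sketch: the module names in REQUIRED_MODULES are the import names I would expect for those ports (libxml2, twisted, OpenSSL, simplejson), and missing_modules is a hypothetical helper, not part of Scrapy or MacPorts.

```python
# Import names assumed for the ports installed in step 7.
REQUIRED_MODULES = ["libxml2", "twisted", "OpenSSL", "simplejson"]


def missing_modules(names):
    """Try to import each module by name; return the names that fail."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling missing_modules(REQUIRED_MODULES) from the MacPorts Python returns an empty list when everything imported, or the names of whatever still needs installing.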
10. Install Scrapy
$ sudo /opt/local/bin/easy_install-2.6 scrapy
11. Make the scrapy command available in the shell
$ sudo ln -s /opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin/scrapy /usr/local/bin/scrapy
One caveat for the above: on a fresh computer you might not have a /usr/local/bin directory, so you will need to create it before you can create the symlink.
$ sudo mkdir /usr/local/bin
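For anyone who would rather script this step, the mkdir-then-symlink logic can be sketched in Python. ensure_symlink is a hypothetical helper written for this example; writing to /usr/local/bin would still require running it with sudo.

```python
import os


def ensure_symlink(target, link_path):
    """Create link_path -> target, creating the parent directory if needed.

    Returns True if a new link was created and False if link_path already
    pointed at target. Raises OSError if link_path exists as something else.
    """
    parent = os.path.dirname(link_path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent)  # same effect as the mkdir command above
    if os.path.islink(link_path) and os.readlink(link_path) == target:
        return False
    os.symlink(target, link_path)  # fails if link_path already exists
    return True
```

With the paths from this step, the call would be ensure_symlink("/opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin/scrapy", "/usr/local/bin/scrapy").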
12. Finally, type either of the following to confirm that Scrapy is indeed running on your system.
$ python scrapy
$ scrapy
A final final bit… I also installed ipython from MacPorts for use with Scrapy
sudo port install py26-ipython
Make a symbolic link
sudo ln -s /opt/local/bin/ipython-2.6 /usr/local/bin/ipython
An article on ipython
http://onlamp.com/pub/a/python/2005/01/27/ipython.html
ipython tutorial
http://ipython.scipy.org/doc/manual/html/interactive/tutorial.html
Give Me My Data instructions in German…
This lovely website has posted instructions in German for using Give Me My Data. Many thanks! English version
Was weiß Facebook über mich?, in Bild
The German newspaper Bild published another article mentioning Give Me My Data today.
Was weiß Facebook über mich? or, in English, What does Facebook know about me?
“About 500 million people worldwide use the social network Facebook to stay in touch with friends. In Germany, almost 9.8 million people are registered with Facebook. Many users are worried about their privacy. BILD.de answered the important questions…”
The Difference Between Then and Now
TINA-B Festival
Nostic Palace, Czech Ministry of Culture, Prague
October 7-24th, 2010
In the October 2010 TINA-B Contemporary Art Festival in Prague, Owen Mundy and Joelle Dietrick will re-stage their 2006 project The Darkest Hour is Just Before Dawn. For the original version, developed in York, Alabama, USA, Owen Mundy and Joelle Dietrick borrowed lamps from the residents and installed them in an abandoned grocery store. Each lamp was set to turn on every night and, because of the inexactitude of the timers chosen, did so in an organic fashion, one by one, reflecting not only the participants in the community, but also the history of Alabama’s social movements. In an area where a nearby hazardous waste landfill made the water undrinkable, the artists and the community collectively revived the vacant commercial space, removing roomfuls of damaged post-Katrina FEMA water boxes and transforming the downtown with the lamps, pulsing at their own pace, human in their imperfections and variety, and more powerful as a collection.
Like a scientific study with controls, the re-staging of the project in Prague and Venice studies the nature of site-specific and community-based art. Both cities provide unusual cross-cultural comparisons about domestic settings and the cultural, geographical and political structures that affect private space. The 2006 installation was developed before the U.S. housing crisis, and these 2010 installations will develop as the global economy still recovers from the impact of the current economic downturn. In this context, the simple gesture of gathering everyday objects and spaces can yield unusual insights into common assumptions about micro-macro shifts: the individual and the state, private spaces and public concerns, local and global.
Set up MacPorts Python and Scrapy successfully
“Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.”
But, it can be a little tricky to get running…
Attempting to install Scrapy on my MBP with the help of this post, I kept running into errors with the libxml and libxslt libraries when following the Scrapy documentation.
I wanted to let MacPorts manage all the libraries, but I had trouble with it referencing the wrong installation of Python. I began with three installs:
- The default Apple Python 2.5.1 located at: /usr/bin/python
- A previous version I had installed, located at: /Library/Frameworks/Python.framework/Versions/2.7
- And a MacPorts version located at: /opt/local/bin/python2.6
My trouble was that:
$ python
would always default to 2.7 when I needed it to use the MacPorts version. The following did not help:
$ sudo python_select python26
I even removed the 2.7 version, which only caused an error.
I figured out I needed to change the default path to the Macports version using the following:
$ PATH=$PATH\:/opt/local/bin ; export PATH
And then reinitiate the ports, etc.
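One thing worth keeping in mind here: a shell runs the first match it finds while walking PATH from left to right, and the export above adds /opt/local/bin at the end. If python still resolves to another install afterwards, ordering may be why. The sketch below mimics that lookup; first_on_path is a hypothetical helper, and the "MacPorts first" comparison is an assumption for illustration, not something from the original instructions.

```python
import os


def first_on_path(command, path_string):
    """Return the first executable named `command` in a PATH-style string."""
    for directory in path_string.split(os.pathsep):
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    path = os.environ.get("PATH", "")
    print("python resolves to: %s" % first_on_path("python", path))
    # Hypothetical comparison: MacPorts directory placed first instead of last.
    print("with /opt/local/bin first: %s"
          % first_on_path("python", "/opt/local/bin" + os.pathsep + path))
```

If the two lines print different paths, putting /opt/local/bin ahead of the other entries in PATH is the variant to try.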
Finally, I was not able to run scrapy-ctl.py by name after following these instructions, so I had to reference the file by its full path:
/opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin/scrapy-ctl.py
UPDATE
A quick addendum to this post with instructions to create the link, found on the Scrapy site (#2 and #3).
Starting with #2, “Add Scrapy to your Python Path”
sudo ln -s /opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin/scrapy-ctl.py /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/scrapy
And #3, “Make the scrapy command available”
sudo ln -s /opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin/scrapy-ctl.py /usr/local/bin/scrapy
Reading list for August 2010
About to embark on some new projects here in Berlin. Here’s my reading list at the moment…
Free: The Future of a Radical Price
by Chris Anderson
July 7th 2009 by Hyperion
Traditional economics operates under fundamental assumptions of scarcity–there’s only so much oil, iron, and gold in the world. But the online economy is built upon three cornerstones: processing power, hard drive storage, and bandwidth–and the costs of all these elements are trending toward zero at an incredible rate.
The Exploit: A Theory of Networks
by Alexander R. Galloway, Eugene Thacker
October 1st 2007 by Univ Of Minnesota Press
“The Exploit is that rare thing: a book with a clear grasp of how networks operate that also understands the political implications of this emerging form of power. It cuts through the nonsense about how ‘free’ and ‘democratic’ networks supposedly are, and it offers a rich analysis of how network protocols create a new kind of control. Essential reading for all theorists, artists, activists, techheads, and hackers of the Net.” —McKenzie Wark, author of A Hacker Manifesto
Group Work
by Temporary Services
New York, NY: Printed Matter. 2007
Based on a pamphlet published by Temporary Services in 2002 titled Group Work: A Compilation of Quotes About Collaboration from a Variety of Sources and Practices, this publication provides a multitude of perspectives on the theme of Group Work by practitioners of artistic group practice from the 1960s to the present.
How to easily set up a campaign finance database (well, kind of) or Make Python work with MAMP via MySQLdb
I’ve been trying for a few hours to run a Python script from The Sunlight Foundation Labs which downloads (and updates) a campaign finance database from the Center for Responsive Politics. See their original post for more information.
In the process of getting this working I accidentally broke a working copy of MySQL and overwrote a database installed on my MBP (which I had stupidly not backed up since last year). FYI, you can rebuild a MySQL database that uses MyISAM tables from the original .frm, .MYD, and .MYI files if you (1) recreate the database in the new install of MySQL and (2) drag the files into the mysql data folder.
I struggled quite a bit getting Python to work with MySQL via MySQLdb. I’m documenting some of the headaches and resolutions here in case they are useful. I’ve tried to include error messages for searches as well.
The Sunlight Foundation instructions require Python and MySQL, but don’t mention you have to have already wrestled with the madness involved in installing Django on your machine. Here is what I did to get it working on my MacBook Pro Intel Core 2 Duo. I’ve included their original instructions with my own (and a host of others).
Instructions
- Install MAMP.
While I had working installations of MySQL and Python (via installers on respective sites), I couldn’t get Python to connect to MySQL via MySQLdb. I decided to download and try MAMP for a clean start.
- Install Xcode
Past installs are available on the Apple Developer website.
- Install setuptools
Required for the MySQLdb driver. Remove the .sh extension from the filename (setuptools-0.6c11-py2.7.egg.sh) and in a shell:
~$ chmod +x setuptools-0.6c11-py2.7.egg
~$ ./setuptools-0.6c11-py2.7.egg
- Install the MySQLdb driver
After downloading and unzipping, from the directory:
~$ python setup.py build
~$ sudo python setup.py install
Continue following the advice of this post to the end: How to install Django with MySQL on Mac OS X.
I also followed another piece of advice in Python MySQL on a Mac with MAMP to change the mysql_config.path from:
/usr/local/mysql/bin/mysql_config
to
/Applications/MAMP/Library/bin/mysql_config
Especially useful is his test script for making sure that Python is indeed accessing MySQL.
- Create a symbolic link between Python and MySQL in MAMP
This is required in order to use a socket to connect to MySQL. See How to install MySQLdb on Leopard with MAMP for more information.
~$ sudo ln -s /Applications/MAMP/tmp/mysql/mysql.sock /tmp/mysql.sock
- Create a directory and put the two Python files in it.
- Modify the top of the sun_crp.py file to set certain parameters–your login credentials for the CRP download site and your MySQL database information.
- Install pyExcelerator
Error:
ImportError: No module named pyExcelerator
I had to install this module next.
- Comment out multiple lines
Error:
NameError: name 'BaseCommand' is not defined
In download.py comment out the following:
The line:
from django.core.management.base import BaseCommand, CommandError
Everything from
class CRPDownloadCommand(BaseCommand):
to the end of the document.
- From the command line, run the script from the proper directory by typing: python sun_crp.py
- It will take several hours to download and extract the data, especially the first time it’s run. But after that, you’re good to go.
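Before kicking off a multi-hour download, it can be worth confirming that Python really reaches MAMP's MySQL through the socket used in the instructions above. This is a minimal sketch under a few assumptions: the root/root credentials are what I understand to be the MAMP defaults, so substitute your own; mysql_version is a hypothetical helper; and MySQLdb must already be installed for the call to succeed.

```python
# Assumed MAMP defaults; adjust the credentials to your own setup.
MAMP_SETTINGS = {
    "user": "root",
    "passwd": "root",
    "unix_socket": "/Applications/MAMP/tmp/mysql/mysql.sock",
}


def mysql_version(settings=None):
    """Connect through MySQLdb and return the server version string."""
    import MySQLdb  # imported here so a missing driver fails at call time

    connection = MySQLdb.connect(**(settings or MAMP_SETTINGS))
    try:
        cursor = connection.cursor()
        cursor.execute("SELECT VERSION()")
        return cursor.fetchone()[0]
    finally:
        connection.close()
```

If mysql_version() returns a version string, the driver, the socket, and the credentials are all working; an ImportError points at the MySQLdb install, while a connection error points at the socket or login.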
Automata: Counter-Surveillance in Public Space paper on the Public Interventions panel at ISEA2010
ISEA2010 RUHR Conference in Dortmund, Germany
P26 Public Interventions
Tue 24 August 2010
15:00–16:30h
Volkshochschule Dortmund, S 137a
Moderated by Georg Dietzler (de)
- 15:00h | Owen Mundy (us): Automata: Counter-Surveillance in Public Space
- 15:20h | Christoph Brunner (ch/ca), Jonas Fritsch (dk): Balloons, Sweat and Technologies. Urban Interventions through Ephemeral Architectures
- 15:40h | Georg Klein (de): Don’t Call It Art! On Artistic Strategies and Political Implications of Media Art in Public Space
- 16:00h | Georg Dietzler (de): Radical Ecological Art and No Greenwash Exhibitions
About my talk:
Automata is the working title for a counter-surveillance internet bot that will record and display the mutually-beneficial interrelationships between institutions for higher learning, the global defense industry, and world militaries. Give Me My Data is a Facebook application that helps users reclaim and reuse their Facebook data. The two projects, both ongoing, address important issues surrounding contemporary forms of communication, surveillance, and control.
Recent and ongoing projects
Howdy, it’s been a while since I last shared news about recent and ongoing projects. Here goes.
1. You Never Close Your Eyes Anymore
You Never Close Your Eyes Anymore is an installation that projects moving US Geological Survey (USGS) satellite images using handmade kinetic projection devices.
Each device hangs from the ceiling and uses electronic components to rotate strips of satellite images on transparency in front of an LED light source. They are constructed with found materials like camera lenses and consumer by-products and mimic remote sensing devices, bomb sights, and cameras in Unmanned Aerial Vehicles.
The installation includes altered images from various forms of lens-based analysis on a micro and macro scale; land masses, ice sheets, and images of retinas, printed on reflective silver film.
On display now until July 31 at AC Institute, 547 W. 27th St, 5th Floor
Hours: Wed., Fri. & Sat.: 1-6pm, Thurs.: 1-8pm
New video by Asa Gauen and images
http://owenmundy.com/site/close_your_eyes
2. Images and video documentation of You Never Close Your Eyes Anymore will also be included in an upcoming Routledge publication and website:
Reframing Photography: Theory and Practice
by Rebekah Modrak, Bill Anthes
ISBN: 978-0-415-77920-3
Publish Date: November 16th 2010
http://www.routledge.com/books/details/9780415779203/
3. Give Me My Data launch
Give Me My Data is a Facebook application designed to give users the ability to export their data out of Facebook for any purpose they see fit. This could include making artwork, archiving and deleting your account, or circumventing the interface Facebook provides. Data can be exported in CSV, XML, and other common formats. Give Me My Data is currently in public-beta.
Website
http://givememydata.com/
Facebook application
http://apps.facebook.com/give_me_my_data/
4. Give Me My Data was also covered recently by the New York Times, BBC, TechCrunch, and others:
Facebook App Brings Back Data by Riva Richmond, New York Times, May 1, 2010
http://gadgetwise.blogs.nytimes.com/2010/05/01/facebook-app-brings-back-data/
5. yourarthere.net launch
A major server and website upgrade to the yourarthere.net web-hosting co-op for artists and creatives. The new site allows members of the community to create profiles and post images, tags, biography, and events. In addition to the community aspect, yourarthere.net is still the best deal going for hosting your artist website.
Website
http://yourarthere.net
More images
http://owenmundy.com/site/design_yourarthere_net
6. The Americans
The Americans is currently on view at Northwest Florida State College in Niceville, FL. It features a new work with the same title.
More images
http://owenmundy.com/site/the-americans
7. Your Art Here billboard hanger
I recently designed a new billboard hanging device and installed it in downtown Bloomington, IN with the help of my brother Reed and my wife Joelle Dietrick.
Stay tuned here for news about Your Art Here and the new billboard by Joelle Dietrick.
http://www.facebook.com/pages/Your-Art-Here/112561318756736
8. Finally, I am moving to Berlin for a year on a DAAD fellowship to work on some ongoing projects, including Automata.
More images
https://owenmundy.com/blog/2010/07/new-automata-sitemaps/
I’ll be giving a paper about Automata at the upcoming ISEA2010 RUHR conference in Dortmund, Germany.
http://www.isea2010ruhr.org/conference/tuesday-24-august-2010-dortmund
Many thanks to Chris Csikszentmihályi, Director of the Center for Future Civic Media http://civic.mit.edu/ , for inviting me to the MIT Media Lab last August to discuss the project with his Computing Culture Group: http://compcult.wordpress.com/