1 Setup Linux Server.
Using CentOS4, hosted on a VMware virtual machine.
Note that the user name I will be using in this guide is simply ‘user’ and I will refer to this as either ‘user’ or ‘username’. Replace this with whatever the username was that you chose during installation.
1.1 Create default user
Get a root prompt on a command line:
e.g.
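[root@server]# cd /usr/sbin
[root@server]# ./useradd -m -d /home/user username
[root@server]# passwd username
(useradd's -p option expects an already-encrypted password, so the password is set with passwd instead.)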
1.2 Python2.5 installation
Ensure you have Python2.5 installed.
We used an older version of CentOS that did not have Python2.5 installed. If Python2.5 did not come installed with the OS you are using, install it from source. Install zlib first so that Python can be configured with zlib support.
(Optional) Download zlib from here
http://www.zlib.net
user@server # tar -zxvf zlib-1.2.3.tar.gz
user@server # cd zlib-1.2.3
user@server # ./configure
user@server # make
user@server # su
root@server # make install
Download python2.5 from the following URL:
http://www.python.org/ftp/python/2.5.2/Python-2.5.2.tgz
Install Python2.5
user@server # cd /home/user
user@server # tar -zxvf Python-2.5.2.tgz
user@server # cd Python-2.5.2
user@server # ./configure --with-zlib=/usr/local/include/
(/usr/local/include/ is the location of the zlib.h file)
user@server # make
user@server # su
root@server # make install
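Before moving on, it can be worth confirming that the freshly built interpreter really has zlib support, since the easy_install step later depends on it. The following is a minimal sketch, not part of the original setup; the helper name zlib_roundtrip_ok is made up for this example, and the code is intended to run on both Python 2.5 and a current Python 3.

```python
import sys
import zlib  # this import fails outright if Python was built without zlib


def zlib_roundtrip_ok(text="fedora-commons"):
    """Return True if zlib can compress and decompress a small payload."""
    payload = text.encode("ascii")  # bytes on Python 3, str on Python 2
    return zlib.decompress(zlib.compress(payload)) == payload


if __name__ == "__main__":
    # Report the zlib version the interpreter was linked against.
    sys.stdout.write("zlib %s works: %s\n" % (zlib.ZLIB_VERSION, zlib_roundtrip_ok()))
```

If the import at the top raises an ImportError, rebuild Python after installing zlib.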
2 Set up networking and firewalls
2.1 IPtables
Edit the iptables file
[root@server] # vi /etc/sysconfig/iptables
Add the following lines:
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 8080 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 5000 -j ACCEPT
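Once the firewall rules are active and the services are running, a quick way to see whether ports 8080 and 5000 are reachable is to attempt a TCP connection to each. This is a rough sketch under stated assumptions: the function names are hypothetical, it is written for a current Python 3 rather than the Python 2.5 used in this guide, and a closed result can mean either that the firewall is blocking the port or that nothing is listening on it yet.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports=(8080, 5000)):
    """Map each port to True (reachable) or False (not reachable)."""
    return {port: port_is_open(host, port) for port in ports}


if __name__ == "__main__":
    # 'localhost' is a placeholder; run this from another machine against
    # the server's address to exercise the firewall rules themselves.
    for port, is_open in sorted(check_ports("localhost").items()):
        print("port %d: %s" % (port, "open" if is_open else "closed"))
```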
2.2 Proxy
[root@server] # export HTTP_PROXY=hostname:PORT
3 Install all updates and get the basic applications we will need
To ensure that your machine is up-to-date, run the following:
[root@server]# yum update
[..... lots of lines of stuff ....]
Total download size: 12 M
Is this ok [y/N]: y
etc....
Once those updates are installed, your machine should be up to date. Now install all the necessary packages:
3.1 Initial packages
[root@server]# yum install gcc
[root@server]# yum install python-devel
[root@server]# yum install mysql-server
[root@server]# yum install openssh-server
3.2 Install easy install
Download the ez_setup.py script from the following URL:
http://peak.telecommunity.com/DevCenter/EasyInstall
Download the ez_setup.py script and run it as root:
[root@server]# wget http://peak.telecommunity.com/dist/ez_setup.py
[root@server]# python ez_setup.py
If you receive the following error you may have to yum install zlib (see above).
  File "ez_setup.py", line 267, in <module>
    main(sys.argv[1:])
  File "ez_setup.py", line 200, in main
    from setuptools.command.easy_install import main
zipimport.ZipImportError: can't decompress data; zlib not available
3.3 Java
To install sun-java5-jdk, locate and download the JDK binary file via the following link:
https://cds.sun.com/is-bin/INTERSHOP.enfinity/WFS/CDS-CDS_Developer-Site/en_US/-/USD/ViewProductDetail-Start?ProductRef=jdk-1.5.0_15-oth-JPR@CDS-CDS_Developer
Place the file jdk-1_5_0_15-linux-i586.bin in the /usr/lib/jvm directory and run the following commands from that directory:
[root@server]# chmod 755 jdk-1_5_0_15-linux-i586.bin
[root@server]# ./jdk-1_5_0_15-linux-i586.bin
Agree to the licence that is displayed.
Set the system path to locate Java at: /usr/lib/jvm/jdk1.5.0_15
3.4 Install mysql-python
[root@server]# yum install mysql-devel
[root@server]# yum install zlib-devel
Download appropriate egg file at the following URL:
http://sourceforge.net/projects/mysql-python
We downloaded MySQL_python-1.2.2-py2.5-win32.egg
[root@server]# easy_install MySQL_python-1.2.2-py2.5-win32.egg
[root@server]# yum install sqlite-devel
[root@server]# easy_install pysqlite
[root@server]# easy_install rdflib==2.4.0
[root@server]# easy_install pylons
[root@server]# easy_install uuid
[root@server]# easy_install beautifulsoup
[root@server]# easy_install elementtree
4 Install some more python libraries
We need to install some Python libraries for later: an iCalendar format library (vobject), an OpenID consumer library (python-openid), a library that can generate UUIDs, and a very good web framework called Pylons:
[root@server]# easy_install python-openid
[root@server]# easy_install uuid
Download python-dateutil-1.4.tar.bz2 from
http://labix.org/python-dateutil#head-2f49784d6b27bae60cde1cff6a535663cf87497b
[root@server]# tar -xvjf python-dateutil-1.4.tar.bz2
[root@server]# cd python-dateutil-1.4
[root@server]# python2.5 setup.py install
[root@server]# easy_install vobject
[root@server]# easy_install pylons
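With this many easy_install steps it is easy to miss one, so a quick import check can help. The sketch below is an illustration only: the module names in REQUIRED are assumptions about what each package exposes for import (they often differ from the package name), the helper name is made up, and the code is written for a current Python 3 rather than Python 2.5.

```python
import importlib.util

# Assumed import names for the packages installed above (package -> module):
# mysql-python -> MySQLdb, python-openid -> openid, python-dateutil -> dateutil
REQUIRED = ["MySQLdb", "rdflib", "pylons", "uuid", "openid", "dateutil", "vobject"]


def missing_modules(names):
    """Return the subset of names that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules(REQUIRED)
    print("All modules found" if not absent else "Missing: " + ", ".join(absent))
```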
5 Get Fedora-Commons and Apache Solr.
Download the Fedora and Solr packages:
[root@server]# su user
[user@server]$ cd /home/user
[user@server]$ wget http://downloads.sourceforge.net/fedora-commons/fedora-3.0b1-installer.jar
[user@server]$ wget http://apache.rmplc.co.uk/lucene/solr/1.2/apache-solr-1.2.0.tgz
6 Make the server environment ready for Fedora Commons
If you now list the home directory, you should see something like this:
[user@server]:~$ ls
apache-solr-1.2.0.tgz fedora-3.0b1-installer.jar
We will need the following:
6.1 A directory to store Fedora’s root directory (config files, logs, libraries, and default Tomcat instance)
I chose to store the Fedora root directory at /opt/fedora30b1:
[user@server]$ sudo -s
[root@server]# mkdir /opt/fedora30b1
Let the user own it: (Remember to change ‘user’ to whatever your user is actually called!)
[root@server]# chown user:user /opt/fedora30b1
(Optional) And to aid upgrading, create a symlink at /opt/fedora to this folder:
[root@server]# ln -s /opt/fedora30b1 /opt/fedora
[root@server]# chown -h user:user /opt/fedora
Fedora needs certain environment variables to be set up now, FEDORA_HOME and JAVA_HOME at the very least. Open up the system wide profile (/etc/profile) and add them in there.
[root@server]# vi /etc/profile
Add the following lines to the end of the file, keeping only one of the two alternatives given for FEDORA_HOME and for CATALINA_HOME (note that there *must not* be any spaces on either side of the ‘=’ character):
# If you did not create the symlink, just point directly at your Fedora root
FEDORA_HOME=/opt/fedora30b1
# or if you did do the 'ln -s ...' step, use this instead:
FEDORA_HOME=/opt/fedora
export FEDORA_HOME
# If you did not create the symlink, just point directly at your tomcat root
CATALINA_HOME=/opt/fedora30b1/tomcat
# or if you did do the 'ln -s ...' step, use this instead:
CATALINA_HOME=/opt/fedora/tomcat
export CATALINA_HOME
JAVA_HOME=/usr/lib/jvm/jdk1.5.0_15
export JAVA_HOME
Save the file
Now, to check that this has worked, type the command ‘exit’ a few times to logout and then log back in again as your default user. If things have worked well, the following commands should work:
[user@server]$ echo $FEDORA_HOME
/opt/fedora30b1
[Or ‘/opt/fedora’ depending on what you chose.]
[user@server]$ echo $JAVA_HOME
/usr/lib/jvm/jdk1.5.0_15
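The same check can be scripted. The following is a minimal sketch, assuming a current Python 3; the function name check_env is hypothetical. Note that CATALINA_HOME will be reported as missing on disk until the Fedora installer has created the bundled Tomcat directory later in this guide.

```python
import os


def check_env(names=("FEDORA_HOME", "CATALINA_HOME", "JAVA_HOME"), environ=None):
    """Return a dict of problems, keyed by variable name (empty if all is well)."""
    environ = os.environ if environ is None else environ
    problems = {}
    for name in names:
        value = environ.get(name)
        if not value:
            problems[name] = "not set"
        elif not os.path.isdir(value):
            problems[name] = "not a directory: %s" % value
    return problems


if __name__ == "__main__":
    issues = check_env()
    if not issues:
        print("Environment looks fine")
    for name, problem in sorted(issues.items()):
        print("%s: %s" % (name, problem))
```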
6.2 A mysql database and account for Fedora to use
Now to sort out MySQL. Remember that default root password you set for MySQL? You’ll need it now.
[user@server]$ mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.0.45-Debian_1ubuntu3.1-log Debian etch distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql>
Now issue the following commands:
mysql> create database fedora30;
Query OK, 1 row affected (0.00 sec)
mysql> grant all on fedora30.* to 'fedoraAdmin'@'localhost' identified by 'PUTYOURPASSWORDHERE';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> ALTER DATABASE fedora30 DEFAULT CHARACTER SET utf8;
Query OK, 1 row affected (0.00 sec)
mysql> ALTER DATABASE fedora30 DEFAULT COLLATE utf8_bin;
Query OK, 1 row affected (0.00 sec)
mysql> exit
Bye
[user@server]$
(NB You may or may not need to add the utf-8 configuration lines for your particular version of MySQL, but as far as I know, the commands are harmless if you don’t need them and utterly crucial if you do. Well, crucial unless you are dealing purely with ascii, but could you really guarantee that?)
7 Install Fedora commons 3.0b1
(Note – Official installation guide is here)
Go to the location where you saved the fedora installer, probably the user’s home directory and run the installer. I’ll include the entire installation dialog here. Where the response is blank, I simply pressed enter to accept the default.
[user@server]$ cd /home/user
[user@server]$ java -jar fedora-3.0b1-installer.jar
***********************
Fedora Installation
***********************
To install Fedora, please answer the following questions.
Enter CANCEL at any time to abort the installation.
Detailed installation instructions are available at:
http://www.fedora.info/download/
Installation type
-----------------
The 'quick' install is designed to get you up and running with Fedora as quickly and easily as possible. It will install Tomcat and an embedded version of the McKoi database. SSL support and XACML policy enforcement will be disabled.
For more options, including the choice of hostname, ports, security, and databases, select 'custom'.
To install only the Fedora client software, enter 'client'.
Options : quick, custom, client
Enter a value ==> custom
Fedora home directory
---------------------
This is the base directory for Fedora scripts, configuration files, etc.
Enter the full path where you want to install these files.
Enter a value [default is /opt/fedora] ==>
Fedora administrator password
-----------------------------
Enter the password to use for the Fedora administrator (fedoraAdmin) account.
Enter a value ==> PUTTHEPASSWORDYOUDLIKEHERE
Fedora server host
------------------
The host Fedora will be running on.
If a hostname (e.g. www.example.com) is supplied, a lookup will be performed and the IP address of the host (not the host name) will be used in the default Fedora XACML policies.
Enter a value [default is localhost] ==>
Authentication requirement for API-A
------------------------------------
Fedora's management (API-M) interface always requires user authentication. Require user authentication for Fedora's access (API-A) interface?
Options : true, false
Enter a value [default is false] ==>
SSL availability
----------------
Should Fedora be available via SSL? Note: this does not preclude regular HTTP access; it just indicates that it should be possible for Fedora to be accessed over SSL.
Options : true, false
Enter a value [default is true] ==>
SSL required for API-A
----------------------
Should API-A be accessible exclusively via SSL? If true, requests to access API-A URLs will be automatically redirected to the secure port.
Options : true, false
Enter a value [default is false] ==>
SSL required for API-M
----------------------
Should API-M be accessible exclusively via SSL? If true, requests to access API-M URLs will be automatically redirected to the secure port.
Options : true, false
Enter a value [default is true] ==> false
Servlet engine
--------------
Which servlet engine will Fedora be running in?
Enter 'included' to use the bundled Tomcat 5.5.23 server.
To use your own, existing installation of Tomcat, enter 'existingTomcat'.
Enter 'other' to use a different servlet container.
Options : included, existingTomcat, other
Enter a value [default is included] ==> included
Tomcat home directory
---------------------
Please provide the full path to your existing Tomcat installation, or the path where you plan to install the bundled Tomcat.
Enter a value [default is /opt/fedora/tomcat] ==>
Tomcat HTTP port
----------------
Which HTTP port (non-SSL) should Tomcat listen on? This can be changed later in Tomcat's server.xml file.
Enter a value [default is 8080] ==>
Tomcat shutdown port
--------------------
Which port should Tomcat use for shutting down? Make sure this doesn't conflict with an existing service. This can be changed later in Tomcat's server.xml file.
Enter a value [default is 8005] ==>
Tomcat Secure HTTP port
-----------------------
Which port (SSL) should Tomcat listen on? This can be changed later in Tomcat's server.xml file.
Enter a value [default is 8443] ==>
Keystore file
-------------
For SSL support, Tomcat requires a keystore file.
If the keystore file is located in the default location expected by Tomcat (a file named .keystore in the user home directory under which Tomcat is running), enter 'default'.
Otherwise, please enter the full path to your keystore file, or, enter 'included' to use the the sample, self-signed certificate) provided by the installer.
For more information about the keystore file, please consult: http://tomcat.apache.org/tomcat-5.5-doc/ssl-howto.html.
Enter a value ==> included
Policy enforcement enabled
--------------------------
Should XACML policy enforcement be enabled? Note: This will put a set of default security policies in play for your Fedora server.
Options : true, false
Enter a value [default is true] ==> false
Enable Resource Index
---------------------
Enable the Resource Index?
Options : true, false
Enter a value [default is false] ==> true
Enable REST-API
---------------
Enable the REST-API? The REST-API is an EXPERIMENTAL feature that exposes the Fedora API with a REST-style interface. In particular, URL endpoints should not be considered final, nor has policy enforcement been evaluated. For more information about the REST-API, see http://www.fedora.info/wiki/index.php/RESTful_Fedora_Proposal
Options : true, false
Enter a value [default is false] ==> true
Database
--------
Please select the database you will be using with Fedora. The supported databases are McKoi, MySQL, Oracle and Postgres. If you do not have a database ready for use by Fedora or would prefer to use the embedded version of McKoi bundled with Fedora, enter 'included'.
Options : mckoi, mysql, oracle, postgresql, included
Enter a value ==> mysql
MySQL JDBC driver
-----------------
You may either use the included JDBC driver or your own copy. Enter 'included' to use the included JDBC driver, or, enter the location (full path) of the driver.
Enter a value [default is included] ==>
Database username
-----------------
Enter the database username Fedora will use to connect to the Fedora database.
Enter a value ==> fedoraAdmin
Database password
-----------------
Enter the database password Fedora will use to connect to the Fedora database.
Enter a value ==> PUTYOURDBPASSWORDHERE
JDBC URL
--------
Please enter the JDBC URL.
Enter a value [default is jdbc:mysql://localhost/fedora30?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true] ==>
JDBC DriverClass
----------------
Please enter the JDBC driver class.
Enter a value [default is com.mysql.jdbc.Driver] ==>
Successfully connected to MySQL
Deploy local services and demos
-------------------------------
Several sample back-end services are included with this distribution. These are required if you want to use the demonstration objects. If you'd like these to be automatically deployed, enter 'true'. Otherwise, the installer will put the files in your FEDORA_HOME/install directory in case you want to deploy them later.
Options : true, false
Enter a value [default is true] ==>
Preparing FEDORA_HOME...
Configuring fedora.fcfg
Installing beSecurity
Installing Tomcat...
Preparing fedora.war...
Processing web.xml
Deploying fedora.war...
Deploying fop.war...
Deploying imagemanip.war...
Deploying saxon.war...
Deploying fedora-demo.war...
Installation complete.
----------------------------------------------------------------------
Before starting Fedora, please ensure that any required environment
variables are correctly defined
(e.g. FEDORA_HOME, JAVA_HOME, JAVA_OPTS, CATALINA_HOME).
For more information, please consult the Installation & Configuration
Guide, located online at
http://www.fedora.info/download/ or locally at
/opt/fedora/docs/userdocs/distribution/installation.html
----------------------------------------------------------------------
That should merrily go away and install and set up Fedora and the bundled Tomcat server for you. Unlike other services you may install, this won’t start the Fedora service, nor will it create a handy startup/shutdown script that integrates with your Linux startup scripts in /etc/init.d. We will create one later on.
8 Further configuration of Fedora 3.0
!IMPORTANT! Fix the broken ‘mail.jar’ library! (Broken, as in the REST API will not work correctly with the version released in 3.0b1)
Get it from here: http://python-fedoracommons-webarchive.googlecode.com/files/mail.jar and use it to replace the mail.jar found in $FEDORA_HOME/tomcat/webapps/fedora/WEB-INF/lib/mail.jar. Restart Tomcat if you need to.
If you can’t find $FEDORA_HOME/tomcat/webapps/fedora/ go to http://localhost:8080/fedora to run the fedora.war file which creates the directory.
I am keen on UUIDs, and I cannot see a good reason for not using them. I suggest using the fedora id ‘namespace’ of uuid, so that a fedora URI will look like
<info:fedora/uuid:d3733f61-1083-4a3e-b914-5a853c42189b>
It is also trivial to generate these in Python; consider the following session:
[user@server]$ python
Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from uuid import uuid4
>>> uuid4().urn[4:]
'uuid:d3733f61-1083-4a3e-b914-5a853c42189b'
To get Fedora to accept these though, the ‘uuid’ namespace needs to be added to the retainPIDs parameter in Fedora’s configuration file.
[user@server]$ vi /opt/fedora/server/config/fedora.fcfg
In vi, type /retainPIDs and press Enter to search for the parameter. Add uuid to the list of namespaces (the ordering is not important):
<param name="retainPIDs" value="demo uuid test changeme ...
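To tie the two steps together, here is a small sketch that generates a uuid-based PID, builds the corresponding Fedora URI, and checks the PID's namespace against a retainPIDs value. The function names are hypothetical, the retainPIDs string passed in is only an example of a space-separated list, and the code assumes a current Python 3.

```python
from uuid import uuid4


def new_pid():
    """Return a PID such as 'uuid:d3733f61-1083-4a3e-b914-5a853c42189b'."""
    return uuid4().urn[4:]  # strip the leading 'urn:' from 'urn:uuid:...'


def fedora_uri(pid):
    """Build the info:fedora URI for a PID."""
    return "info:fedora/" + pid


def namespace_is_retained(pid, retain_pids_value):
    """Check whether the PID's namespace appears in a retainPIDs value."""
    namespace = pid.split(":", 1)[0]
    return namespace in retain_pids_value.split()


if __name__ == "__main__":
    pid = new_pid()
    print(fedora_uri(pid))
    # Example value only; use the list from your own fedora.fcfg.
    print(namespace_is_retained(pid, "demo uuid test changeme"))
```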
9 Installing Solr
[NB You should only need to follow the guide below, but here are the official docs, should you get into trouble:
http://wiki.apache.org/solr/SolrInstall – Basic installation
http://wiki.apache.org/solr/SolrTomcat – Tomcat specific things to bear in mind]
Extract the whole archive somewhere on disc and you will see something like this in the apache-solr-1.2.0 folder:
~/apache-solr-1.2.0$ ls
build.xml CHANGES.txt dist docs example KEYS.txt lib LICENSE.txt NOTICE.txt README.txt src
~/apache-solr-1.2.0$ ls dist
apache-solr-1.2.0.jar apache-solr-1.2.0.war
The easiest thing is to install Solr straight into the instance of Tomcat that Fedora has installed. Be aware that search applications eat RAM and heap for breakfast, so install it onto a server with plenty of RAM, and increase the heap space available to the Tomcat instance by setting the environment variable CATALINA_OPTS to "-Xmx512m". This can be done inside the catalina.sh script in your /opt/fedora/tomcat/bin directory.
[i.e. just add CATALINA_OPTS="-Xmx512m" at the beginning of the file if it doesn’t already exist. Use plain straight quotes, not typographic ones.]
One final bit of advice before I point you at the rather good installation docs is that you might want to rename the .war file to match with the URL pathname you desire, as the guide relies on Tomcat automatically unpacking the archive:
So, a war called “apache-solr-1.2.0.war” will result in the final app being accessible at http://tomcat-hostname:8080/apache-solr-1.2.0/. We will rename ours when we copy it into Tomcat’s webapps directory.
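The naming rule described above can be written down in a couple of lines. This is only an illustrative sketch of the simple case (a plainly named .war dropped into webapps); the function name war_url is made up, and the host and port defaults are placeholders taken from the example URL.

```python
import os


def war_url(war_filename, host="tomcat-hostname", port=8080):
    """Return the URL at which Tomcat would serve an auto-unpacked .war file."""
    # The context path is the file name without its directory or '.war' suffix.
    context = os.path.splitext(os.path.basename(war_filename))[0]
    return "http://%s:%d/%s/" % (host, port, context)


if __name__ == "__main__":
    print(war_url("apache-solr-1.2.0.war"))
    print(war_url("/opt/fedora/tomcat/webapps/solr.war", host="localhost"))
```

Renaming the file to solr.war therefore gives the shorter /solr/ path used in the rest of this guide.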
Finally, Solr needs a place to keep its configuration files and its indexes. The indexes themselves can get huge (1 GB is not unheard of) and need somewhere to be stored. The documentation linked to above refers to this location as ‘your solr home’, so it would be wise to make sure that this location has the space to expand. (NB this is not the directory inside Tomcat where the application was unbundled.)
So, let’s create a solr home in /opt as we did for fedora (NB change user):
[user@server]$ sudo -s
[root@server]# mkdir /opt/solr
[root@server]# chown user:user /opt/solr
Place the solr.war into Fedora’s Tomcat instance:
[root@server]# exit
[user@server]$ pwd
/home/user/apache-solr-1.2.0
[user@server]$ cp dist/apache-solr-1.2.0.war $CATALINA_HOME/webapps/solr.war
Finally, we have to make sure a variable is available in Tomcat’s environment; the location of the Solr home directory. Remember that CATALINA_OPTS line we added before? Amend that now to look like:
(E.g. via vi $CATALINA_HOME/bin/catalina.sh )
CATALINA_OPTS="-Xmx512m -Dsolr.solr.home=/opt/solr"
Now, as we will shape the Solr search service later on (i.e. choosing the fields to be indexed, and how to index them for faceted searching) we will just copy across the basic solr example, to make sure everything is running fine.
[Make sure you are in the unpacked solr directory:]
[user@server]$ pwd
/home/user/apache-solr-1.2.0
[user@server]$ cp -a example/solr/* /opt/solr
[user@server]$ ls /opt/solr
bin conf README.txt
10 Adding HTTP authentication to Solr update
First add a username/password to tomcat/conf/tomcat-users.xml:
<tomcat-users>
  ...
  <user username="solradmin" password="XXXXXXXX" roles="solradmin"/>
  ...
</tomcat-users>
Then, in your Solr context, in tomcat/webapps/solr/WEB-INF/web.xml, add the following:
<web-app>
  .... usual stuff ....
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>SolrUpdate</web-resource-name>
      <url-pattern>/update/*</url-pattern>
    </web-resource-collection>
    <auth-constraint>
      <role-name>solradmin</role-name>
    </auth-constraint>
  </security-constraint>
  <!-- Define the Login Configuration for this Application -->
  <login-config>
    <auth-method>BASIC</auth-method>
    <realm-name>Auth needed</realm-name>
  </login-config>
</web-app>
NB BASIC authentication sends the password in plain text, so this isn’t too great, but it is suitable for a localhost updater. Change this to DIGEST to increase the security, but bear in mind you may need to set the Realm for the Tomcat container and the Digest hash mechanism (SHA1, MD5, etc.)
(Some good guides to securing Tomcat services are but a Google search away – for example: http://www.unidata.ucar.edu/projects/THREDDS/tech/reference/TomcatSecurity.html )
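From the client side, an update request against the protected URL then has to carry the credentials. The sketch below shows one way this might look with the standard library of a current Python 3; the helper names are hypothetical, the password is a placeholder, and it is not the code used by the web archive application itself. It also shows why BASIC is weak: the header is only base64-encoded, not encrypted.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic 'Authorization' header."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_update_request(solr_url, xml_body, username, password):
    """Build (but do not send) a POST to Solr's /update handler."""
    return urllib.request.Request(
        solr_url.rstrip("/") + "/update",
        data=xml_body.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "Authorization": basic_auth_header(username, password),
        },
    )


if __name__ == "__main__":
    request = build_update_request(
        "http://localhost:8080/solr", "<commit/>", "solradmin", "XXXXXXXX"
    )
    # Sending requires a running, configured Solr instance:
    # urllib.request.urlopen(request)
    print(request.full_url, request.get_header("Authorization"))
```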
11 Test your foundation
Now, we need to start up Fedora, and hopefully, it will all go smoothly:
[user@server]$ cd /opt/fedora/tomcat/bin/
[user@server]$ ./startup.sh
Using CATALINA_BASE: /opt/fedora/tomcat
Using CATALINA_HOME: /opt/fedora/tomcat
Using CATALINA_TMPDIR: /opt/fedora/tomcat/temp
Using JRE_HOME: /usr/lib/jvm/java-1.5.0-sun
Now try these links:
http://localhost:8080/fedora/search
http://localhost:8080/fedora/describe – make sure ‘uuid’ is one of the retainPIDs
http://localhost:8080/solr/admin Should look like a whole heap of options and bells and whistles.
Any 404 or 500 server error means that something has come unstuck.
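If you prefer to check the three links from a script rather than a browser, something along these lines may do. It is a sketch assuming a current Python 3 and services running on localhost:8080; the function names are made up, and it only looks at the HTTP status code, not at the page content.

```python
import urllib.error
import urllib.request

URLS = [
    "http://localhost:8080/fedora/search",
    "http://localhost:8080/fedora/describe",
    "http://localhost:8080/solr/admin",
]


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 or 500
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, unknown host, ...


def report(urls, fetch=http_status):
    """Map each URL to True if it answered with a non-error status."""
    results = {}
    for url in urls:
        status = fetch(url)
        results[url] = status is not None and status < 400
    return results


if __name__ == "__main__":
    for url, ok in report(URLS).items():
        print("%-45s %s" % (url, "OK" if ok else "PROBLEM"))
```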
12 Python-FedoraCommons-Webarchive
12.1 Installation
12.1.1 Download tar file
From http://code.google.com/p/python-fedoracommons-webarchive/downloads/list
Download and unarchive the archive_vXX.tar.gz
12.1.2 Schema file
If you are planning on using simple dublin core metadata as your core metadata vocabulary, there is a ‘default_solr_schema.xml’ in the archive root directory of the archive tar file.
The schema.xml will create fields for the default 14 terms.
The fields are directly linked to the Dublin Core names, e.g. dc:title -> ‘title’ field in Solr. Also, every field has an additional facet field with the prefix ‘f_’. Therefore, ‘title’ is the field to search, but ‘f_title’ is the field to draw facets from.
Restart Solr with its new schema.xml.
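As an illustration of the naming convention, the sketch below derives the search and facet field names from a Dublin Core term and builds a Solr select URL that searches one and facets on the other. The function names are hypothetical and the base URL is a placeholder; q, facet and facet.field are standard Solr request parameters. The code assumes a current Python 3.

```python
from urllib.parse import urlencode


def solr_field_names(dc_term):
    """Map 'dc:title' to ('title', 'f_title'): the search field and its facet field."""
    name = dc_term.split(":", 1)[-1]
    return name, "f_" + name


def facet_query_url(solr_url, dc_term, text):
    """Build a select URL that searches the plain field and facets on the f_ field."""
    search_field, facet_field = solr_field_names(dc_term)
    params = [
        ("q", "%s:%s" % (search_field, text)),
        ("facet", "true"),
        ("facet.field", facet_field),
    ]
    return solr_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(facet_query_url("http://localhost:8080/solr", "dc:title", "fedora"))
```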
12.1.3 Edit Search Terms
user@server # cd archive/lib/
Edit the ‘search_terms.py’ file to reflect the fields present in Solr.
This is currently a static, hand-edited list, but it is a library function because I intend to ‘auto-magically’ derive the fields from the Solr instance itself, by downloading its schema and parsing it.
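To give an idea of what deriving the fields from the schema could look like, here is a rough sketch that parses a schema.xml document and lists its field names. It is not the author's implementation: the function names and the inline SAMPLE_SCHEMA are invented for the example, how the schema text is fetched from the Solr instance is left out, and the code assumes a current Python 3.

```python
import xml.etree.ElementTree as ET

# Made-up miniature schema, for illustration only.
SAMPLE_SCHEMA = """<schema name="example" version="1.1">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="f_title" type="string" indexed="true" stored="false"/>
  </fields>
</schema>"""


def field_names(schema_xml):
    """Return the name of every <field> element, in document order."""
    root = ET.fromstring(schema_xml)
    return [f.get("name") for f in root.iter("field") if f.get("name")]


def split_search_and_facet(names, facet_prefix="f_"):
    """Separate plain search fields from their f_-prefixed facet fields."""
    search = [n for n in names if not n.startswith(facet_prefix)]
    facets = [n for n in names if n.startswith(facet_prefix)]
    return search, facets


if __name__ == "__main__":
    print(split_search_and_facet(field_names(SAMPLE_SCHEMA)))
```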
12.1.4 Change access settings and passwords
Open lib/app_globals.py and lib/scan_changed_items_since.py for editing.
Change the URL and credentials for the Fedora Commons service and the Solr service present at the top of these files. (If your Solr service needs no password, just use empty ('') strings.)
Also, look for the setting of self.root. This is the host of your application, so change it from ‘http://localhost:5000/’ to whatever your server location is.
12.1.5 Start serving the application
Locate development.ini at the root of the archive directory
user@server# paster serve --reload development.ini
This should start serving the application, using the settings present in development.ini (which has debugging turned on by default). The --reload flag quietly and automatically reloads any pages, controllers or templates that it detects to have changed.
Should you want to change the port at which this application runs, the .ini file contains a property near the beginning of the file, reading ‘port: 5000’. Change this to whichever port is needed, but bear in mind that low ports, such as port 80 and 443, are restricted and the server would need to be run with elevated permissions.
Need to make it https? Create or get an ssl.pem file (as you might for an Apache server) and add this to the root of the web application. Add into the .ini file, just under the ‘port’ declaration – ‘ssl_pem: name_of_pem_file.pem’
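To double-check which port a given .ini file will use, and whether it falls in the restricted range, the file can be read with Python's standard configparser. This is a sketch under assumptions: the [server:main] section name and the SAMPLE_INI content are assumptions about the layout of development.ini rather than something taken from the archive, the function names are made up, and the code targets a current Python 3.

```python
import configparser

# Assumed minimal layout; check it against your own development.ini.
SAMPLE_INI = """[server:main]
use = egg:Paste#http
host = 0.0.0.0
port = 5000
"""


def server_port(ini_text, section="server:main"):
    """Return the configured port as an integer."""
    # Interpolation is disabled so values containing '%' are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.getint(section, "port")


def needs_elevated_permissions(port):
    """Ports below 1024 are restricted on Linux."""
    return port < 1024


if __name__ == "__main__":
    port = server_port(SAMPLE_INI)
    print(port, "restricted" if needs_elevated_permissions(port) else "unrestricted")
```

The parser accepts both ‘port = 5000’ and ‘port: 5000’ as spellings of the same setting.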
13 Harvest from Eprints
- Go to archive/helpful_scripts
- copy harvest_or08.py to archive/lib
- Modify fedora and Solr passwords in harvest_or08.py
- Set http_proxy
- Run python script
user@server# python harvest_or08.py EprintsURL/cgi/export_all?format=ResMapUrls
If the script fails with the following error: ‘import error – cannot import name xpath’, try:
root@server# easy_install pyxml
- Once the script has harvested items into Fedora, run the scan items script found in the lib directory:
user@server# python scan_changed_items_since.py