Software on Ponyland¶
Several categories of software are available on ponyland: globally installed packages, which are automatically available to everyone; the extensive LaMachine installation (now deprecated), which offered much more software that you had to activate explicitly; your own containers; and extra software that you have to opt in to explicitly. The latter is organized in so-called namespaces.
Globally installed¶
The globally installed packages include all common unix tools and
various compilers and interpreters. A full list of all installed
packages can be obtained through dpkg -l. Below is a list of
the most notable software.
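Because the full dpkg -l listing is long, it is often easier to query a single package. A small sketch using dpkg-query, which ships with dpkg; the function name pkg_version is made up for this example:

```shell
# Print the installed version of one package, or nothing if it is absent.
pkg_version() {
    # -W shows matching packages, -f selects which fields to print.
    dpkg-query -W -f='${Version}\n' "$1" 2>/dev/null || true
}

pkg_version python3
```

An empty result means the package is not installed globally; it may still be available through a namespace or a container.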
Experiment framework¶
- SLURM
- parallel
Compilers and interpreters¶
- gcc/g++ (5.4)
- python (2.7)
- python3 (3.5)
- perl (5.22)
- java/javac (Java 8, OpenJDK 1.8)
- R (3.2)
- matlab
- octave (open-source matlab alternative)
Research Tools¶
Version control¶
Using some form of version control to maintain your code is highly recommended!
Container platform¶
We offer Apptainer (formerly known as Singularity) as a container platform. It can interoperate with OCI/Docker containers. Please see containers_on_ponyland.
Editors¶
- vi/vim
- emacs
- joe
- nano/pico (easiest for beginners, but very few features)
- Graphical editors (use X-forwarding)
- gedit
- Sublime Text (namespace texteditors)
Other¶
- latex/pdflatex/tex
- gnuplot
- pdf2txt
- graphviz (dot/circo/etc)
- antiword
LaMachine¶
(Previously maintained by Maarten van Gompel, now deprecated)
NOTE: LaMachine is deprecated! Depending on the software you are interested in, use either your own Python virtual environments or containers instead.
See this comment for help migrating from LaMachine to other solutions, depending on the underlying software you are looking for.
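For Python software, the suggested replacement is your own virtual environment. A minimal sketch, assuming the global python3 has the standard venv module available; the location ~/venvs/myproject is only a placeholder:

```shell
venv_dir="$HOME/venvs/myproject"    # hypothetical location; choose your own
python3 -m venv "$venv_dir"         # create the environment (only needed once)
. "$venv_dir/bin/activate"          # activate it in the current shell
python --version                    # this is now the interpreter inside the venv
```

Packages installed with pip while the environment is active end up inside that directory; run deactivate to leave the environment.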
Opt-in: namespaces¶
Namespaces group software for which you have to opt in explicitly.
This prevents the software from conflicting with any versions of the
same software you may have installed locally.
Setting a namespace means that your $PATH, $LD_LIBRARY_PATH,
$PYTHONPATH and various other environment variables are extended
to include the software in the namespace. All namespaces are in
/vol/customopt/.
To use a namespace, add a line of the following form to the bottom of the
file ~/.bash_profile:
. pathadd $namespace
The initial . is important (it does the same as source, but works in
all shells). Substitute $namespace by the name of the namespace, for example
uvt-ru. You can add multiple lines for multiple namespaces.
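To see what a namespace actually added, you can inspect $PATH afterwards. The helper below is a sketch (the function name is made up for this example); it relies only on the fact that all namespaces live under /vol/customopt/.

```shell
# List every search-path entry that lives under /vol/customopt/, one per line.
list_namespace_paths() {
    # $1: colon-separated search path (defaults to the current $PATH)
    printf '%s\n' "${1:-$PATH}" | tr ':' '\n' | grep '^/vol/customopt/' || true
}

list_namespace_paths
```

If nothing is printed after a fresh login, no namespace is active in that shell. The same function can be pointed at another variable, for example list_namespace_paths "$LD_LIBRARY_PATH".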
The following namespaces exist and contain the software mentioned below:
1. kaldi¶
(Maintained by Maarten van Gompel)
. pathadd kaldi
All this does is set $KALDI_ROOT to /vol/customopt/kaldi, where
kaldi is installed globally for all to use (read-only).
2. machine-learning¶
(Maintained by admin, requested by Iris Hendrickx)
. pathadd machine-learning
Contains Machine Learning tools
- Glove
- LCS
- Maxent
- Paramsearch
- SVM
- svm_classify
- svm_learn
- Treekernels:
- svm_classify_tk
- svm_learn_tk
- SVMTool v 1.3.2
- Conditional random fields
- crf_learn
- crf_test
- Mallet 2.07:
- mallet
- Ripper
- Weka 3.6.8
- weka (Use X-forwarding)
- Word2Vec
LCS quirks¶
The admin gets a lot of questions about the apparent non-functioning of LCS. It does work, but a number of requirements must be satisfied. Please read this before you ask for help:
- Java needs to be able to present something visually (although it does not use this). Make sure you have X-forwarding enabled, or run in headless mode (-Djava.awt.headless=true).
- LCS only seems to work in your home directory, not on network disks.
- LCS caches files. Remove the cache folder before each run.
- What LCS calls 'data' is where you store your models; what LCS calls 'files' is where you store your data.
- "Missing too many documents in index" can also mean that your models folder (called 'files' in the config file) or cache folder is not reachable by Winnow.
- When the test file has an incorrect format, LCS strangely complains about the training file being a directory.
- When you are using the incorrect version of Java (the correct version is Java 6), LCS in some cases fails with the error 'Comparison method violates its general contract'.
- The details of the results seem to be influenced by the names of the training files.
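Two of the points above, headless mode and clearing the cache, can be scripted. This is only a sketch: the function name is made up, and the cache folder must be the one named in your own LCS configuration. JAVA_TOOL_OPTIONS is a standard variable that the JVM reads at startup (it prints a "Picked up JAVA_TOOL_OPTIONS" notice when it does).

```shell
# Run before each LCS job. Argument: the LCS cache folder to remove.
lcs_prepare() {
    cache_dir=$1
    # Refuse an empty argument so that rm -rf is never run on "".
    [ -n "$cache_dir" ] || { echo "usage: lcs_prepare CACHE_DIR" >&2; return 1; }
    rm -rf -- "$cache_dir"
    # Every JVM started from this shell will now run headless.
    export JAVA_TOOL_OPTIONS="-Djava.awt.headless=true"
}
```

Call it as lcs_prepare followed by your cache folder, then start LCS from the same shell.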
3. SyntaxNet¶
Because SyntaxNet has so many dependencies, it did not fit into an existing namespace. Instead, it is a separate virtualenv that you can access by running this command:
source /vol/customopt/syntaxnet/bin/activate
4. alpino¶
(obsolete, use a container instead)
5. nlptools¶
(Maintained by Maarten van Gompel & admin)
. pathadd nlptools
Contains various 3rd party software for Natural Language Processing
- opennlp (java)
- Stanford Core NLP (java)
- Stanford NER (Named Entity Recognizer)
- Python binding for Stanford Core NLP
- FreeLing
- tdp (tdparse, tdptrain)
- Treetagger
- Ngram Statistics Package
- Gurobi
Stanford NER:
# Use shortcut:
ner myfile.txt
# Or
cd /vol/customopt/nlptools/stanford-ner
java -mx600m edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier classifiers/english.all.3class.distsim.crf.ser.gz -textFile sample.txt
If you want to use Gurobi, please obtain your own Gurobi license by:
- Requesting one here: http://user.gurobi.com/download/licenses/free-academic
- Retrieving it with the key you receive:
grbgetkey <YOUR_LICENSE_KEY>
- Saving it to your home directory, and you should be ready to go!
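To check that the license ended up where Gurobi will find it, a sketch like the following may help. It assumes Gurobi's usual behaviour of reading the file named by $GRB_LICENSE_FILE if that variable is set, and gurobi.lic in your home directory otherwise; the function name is made up for this example.

```shell
# Print the path where the Gurobi license file is expected.
gurobi_license_path() {
    printf '%s\n' "${GRB_LICENSE_FILE:-$HOME/gurobi.lic}"
}

lic=$(gurobi_license_path)
if [ -f "$lic" ]; then
    echo "license found: $lic"
else
    echo "no license at $lic"
fi
```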
6. texteditors¶
Maintained by admin
. pathadd texteditors
7. mongodb¶
Maintained by admin, requested by Ali Huerriyetoglu
. pathadd mongodb
mkdir -p ~/mongodb   # Data directory; -p makes this safe to run again
PORT=3573   # Choose a port above 1024
mongod --dbpath ~/mongodb --port "$PORT"
See http://docs.mongodb.org/manual/reference/program/mongod/#bin.mongod for more options.
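Ports up to 1024 are reserved for root, which is why a higher port is needed. A small sketch of a sanity check for the chosen value; the function name is hypothetical:

```shell
# Succeeds only for an unprivileged TCP port number (1025-65535).
is_user_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or not purely digits
    esac
    [ "$1" -gt 1024 ] && [ "$1" -le 65535 ]
}

PORT=3573
if is_user_port "$PORT"; then
    echo "port $PORT is usable"
else
    echo "pick a different port"
fi
```

This only checks the range. Another user may already be using the same port, in which case the server will refuse to start and you should pick another number.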
8. nodebox¶
Requested by Peter Berck
. pathadd nodebox
Python Library (Nodebox Linguistics Library)
- linguistics library (import en)
9. r4¶
. pathadd r412
R, a software package for statistical computing. This namespace replaces the older, globally installed version (3.2) with R 4.1.2, released in November 2021.
10. python3¶
(Obsolete, abandoned)
11. python2-packages¶
(No longer maintained. We recommend you use Python 3 instead; otherwise, set up your own Python 2.7 virtualenv.)
12. python36¶
(Maintained by admin)
. pathadd python36
13. python38¶
(Maintained by admin)
. pathadd python38
14. redis¶
Requested by Emmanuel Chamilakis
https://redis.io version 5.0.7
Installation guide that was followed: https://techmonger.github.io/40/redis-without-root
. pathadd redis
PORT=7777 # Choose a port above 1024
MYREDISCONF=~/myredis.conf # Copy and change /vol/customopt/redis507/redis.conf
redis-server $MYREDISCONF --port $PORT
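The comment above asks you to copy and change the shared configuration file. A sketch of that copy step, written so that it never overwrites a private configuration you have already edited; the function name is made up for this example:

```shell
# Copy the shared template to a private config, unless one already exists.
# Arguments: template path, destination path.
copy_redis_conf() {
    [ -e "$2" ] || cp -- "$1" "$2"
}

# On ponyland this would be called as:
#   copy_redis_conf /vol/customopt/redis507/redis.conf ~/myredis.conf
```

Once the server is running, redis-cli -p $PORT ping should answer PONG.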
15. NodeJS¶
(Maintained by admin)
. pathadd nodejs
Current version: v18.18.2
16. SRILM¶
The SRI Language Modeling Toolkit, version 1.7.3.
. pathadd srilm
17. CmdStan¶
Software for Bayesian data analysis, version 2.34.1. Can be used from within R using the cmdstanr package.
. pathadd cmdstan