Archive for November, 2009

Java Imenu

Monday, November 30th, 2009

My patch to CC-mode, which I talk about here, was finally accepted and submitted. The final regex turns out to be:

 
(defvar cc-imenu-java-generic-expression
  `((nil
     ,(concat
       "[" c-alpha "_][\]\[." c-alnum "_<> ]+[ \t\n\r]+" ; type spec
       "\\([" c-alpha "_][" c-alnum "_]*\\)" ; method name
       "[ \t\n\r]*"
       ;; An argument list that is either empty or contains any number
       ;; of arguments.  An argument is any number of annotations
       ;; followed by a type spec followed by a word.  A word is an
       ;; identifier.  A type spec is an identifier, possibly followed
       ;; by < typespec > possibly followed by [].
       (concat "("
               "\\("
                 "[ \t\n\r]*"
		 "\\("
		   "@"
		   "[" c-alpha "_]"
		   "[" c-alnum "._]*"
		   "[ \t\n\r]+"
		 "\\)*"
		 "\\("
		   "[" c-alpha "_]"
		   "[\]\[" c-alnum "_.]*"
		   "\\("
		     "<"
		     "[ \t\n\r]*"
		     "[\]\[.," c-alnum "_<> \t\n\r]*"
		     ">"
		   "\\)?"
                   "\\(\\[\\]\\)?"
                   "[ \t\n\r]+"
                 "\\)"
                 "[" c-alpha "_]"
                 "[" c-alnum "_]*"
                 "[ \t\n\r,]*"
               "\\)*"
               ")"
               "[.," c-alnum " \t\n\r]*"
               "{"
               )) 1))
  "Imenu generic expression for Java mode.  See `imenu-generic-expression'.")

You can install this in your Emacs initialization file for now; it will be in the 23.2 release. I’m glad it finally made it in - it will make my personal customizations slightly shorter.
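To get a feel for what the expression accepts without loading it into Emacs, here is a rough Python transliteration. It is only a sketch: c-alpha and c-alnum are replaced with plain ASCII ranges (an assumption; CC-mode's own definitions are broader), and the names JAVA_METHOD and method_names are made up for this example.

```python
import re

WS = r"[ \t\n\r]"
ALPHA = "A-Za-z"      # assumption: ASCII stand-in for c-alpha
ALNUM = "A-Za-z0-9"   # assumption: ASCII stand-in for c-alnum

JAVA_METHOD = re.compile(
    rf"[{ALPHA}_][\]\[.{ALNUM}_<> ]+{WS}+"       # type spec
    rf"([{ALPHA}_][{ALNUM}_]*)"                    # method name (group 1)
    rf"{WS}*"
    r"\("
    rf"(?:{WS}*"
    rf"(?:@[{ALPHA}_][{ALNUM}._]*{WS}+)*"          # any number of annotations
    rf"(?:[{ALPHA}_][\]\[{ALNUM}_.]*"              # argument type
    rf"(?:<{WS}*[\]\[.,{ALNUM}_<> \t\n\r]*>)?"     # optional < typespec >
    rf"(?:\[\])?{WS}+)"
    rf"[{ALPHA}_][{ALNUM}_]*"                      # argument name
    r"[ \t\n\r,]*"
    r")*"
    r"\)"
    rf"[.,{ALNUM} \t\n\r]*"                        # e.g. a throws clause
    r"\{"
)


def method_names(source):
    """Return the method names the expression finds in a Java source string."""
    return [match.group(1) for match in JAVA_METHOD.finditer(source)]
```

Running method_names over a class body should pick up declarations with annotations, generics and array arguments, while skipping control-flow lines such as "else if (ready) {".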

As a result of this patch, I was also asked to join the cc-mode project in order to bring their Java support up to 1.6 - it’s unfortunately lagged behind, this patch being one example. If you’ve had any similar issue, please let me know!

3B Courses

Thursday, November 26th, 2009

I decided to write a little bit about the courses I have this term. Overall, I’m pretty happy with them, although Graphics is the only one I have nothing to complain about. Since the term is nearly over, it seemed like a good idea to reflect on these.

Software Requirements and Specification - SE 457
This course wasn’t one I was particularly looking forward to, but it turned out to be better than I expected. Our prof turned out to be quite good, and while the material isn’t the most interesting I’ve ever learned, it isn’t as boring as I feared. The assignments are unfortunately a lot of work and not that interesting; they involve reverse-engineering the graduate admissions system. I feel the course could be better if it were more integrated with our design project course: the assignments could be to gather requirements and document the design of our project, instead of something fairly useless.

Graphics - CS 488
This course has been a lot of fun so far, but it also lives up to its reputation of being a lot of work. Our professor is good, so the lectures are pretty interesting. The assignments are difficult, starting from 3D Tetris and going up to a raytracer, but it is not difficult to get good marks in them if you just put in the work - the grading scheme is known and pretty objective. I wasn’t sure how I was going to like this course, not having any real graphics experience previously, but I’m glad I took it. However, I dislike the lack of a midterm, since it means there is no indication of what the final will be like.

Introduction to Database Management - CS 348
This course hasn’t been particularly interesting so far. Our prof is decent, which helps, but the material is pretty basic. The assignments are fair, and were representative of what was on the midterm. I probably wouldn’t have taken this class if it weren’t required, but at least it’s not actively bad and is instead just boring.

Concurrency - CS 343
This course is a bit of a mixed bag. The prof is rather mediocre, and a lot of the value in the assignments is obscured by the use of uC++. For those of you that don’t know, uC++ is a C++ dialect that is used by essentially nobody. I really disagree with the use of uC++ for this; I think the course should be taught in a higher-level language. As it is, the majority of the time in the assignments is spent debugging C++ issues instead of dealing with actual concurrency; even Java wouldn’t have nearly as many problems. Since uC++ doesn’t handle concurrency like any other language, the language-specific material from the assignments is also not particularly useful elsewhere; if that is going to be the case, the assignments should at least be about concurrency itself, in my opinion. Some of the material is pretty interesting, as long as we are talking about concurrency in general and not uC++, but this isn’t nearly as often as I’d like.

Formal Languages and Parsing - CS 462
The topics covered in this course are really interesting. I like a lot of the theoretical aspects of CS, and this course is definitely on the very theoretical side. Unfortunately, the lecturer we have is fairly bad, which is disappointing as this was the class I was looking forward to most. The assignments have so far been pretty reasonable, although several of the problems are straight from the book. Again, the lack of a midterm means there isn’t necessarily a good way to prepare for the final, but the final is take-home at least.

Design Project - SE 390
I’m still not entirely sure what I think about this course. The entire course revolves around a group project which we pick and continue developing for the next 3 school terms. This is a very cool idea, and a project of this length has brought up some issues that don’t generally come up in school assignments. The actual deliverables for the project this term are very ill-defined, which is pretty annoying since we are getting an objective grade. The lectures, while entertaining, are also more or less useless - Paul Ward mostly just talks about whatever he wants. It seems a lot like an applied entrepreneurship course, which is neat, but I’m not sure it should be a required class.

On building fast kd-Trees for Ray Tracing, and on doing that in O(N log N)

Monday, November 23rd, 2009

On building fast kd-Trees for Ray Tracing, and on doing that in O(N log N)
Ingo Wald, Vlastimil Havran

Kd-trees seem to be the most widely used technique for spatial subdivision, and are believed to be the best known method for fast ray tracing. The Surface Area Heuristic is the construction technique usually used, and how to build efficient trees and traverse them quickly is well understood. However, constructing them quickly is not as well understood, which is what this paper hopes to analyze.

All kd-tree construction schemes follow the same recursive scheme:

function RecBuild( triangles T, voxel V ) returns node
    if Terminate(T,V) then:
        return new leaf node( T )
    p = FindPlane( T, V )
    (V_L, V_R ) = Split V with p
    T_L = all members of T that overlap V_L
    T_R = all members of T that overlap V_R
    return new node( p, RecBuild(T_L, V_L), RecBuild(T_R, V_R) )

The difference between a good and a naively built kd-tree is often a factor of two or more. The naive method often used, the “spatial median” method, is to just cycle through the axes, splitting each voxel at its spatial median along the current axis.
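As a rough illustration of the recursive scheme with the spatial median rule plugged in, here is a minimal Python sketch. It makes several assumptions that are not from the paper: triangles are stood in for by their axis-aligned bounding boxes, the termination rule is a simple item count plus depth limit, and the names Node, overlaps, split and rec_build are invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]
Box = Tuple[Vec, Vec]  # (min corner, max corner)


@dataclass
class Node:
    axis: Optional[int] = None        # None marks a leaf
    position: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    items: Optional[List[Box]] = None


def overlaps(box: Box, voxel: Box) -> bool:
    """True if the two boxes intersect (touching counts as overlapping)."""
    return all(box[0][a] <= voxel[1][a] and box[1][a] >= voxel[0][a]
               for a in range(3))


def split(voxel: Box, axis: int, position: float) -> Tuple[Box, Box]:
    """Cut a voxel in two with an axis-aligned plane."""
    lo, hi = voxel
    left_hi = tuple(position if a == axis else hi[a] for a in range(3))
    right_lo = tuple(position if a == axis else lo[a] for a in range(3))
    return (lo, left_hi), (right_lo, hi)


def rec_build(items: List[Box], voxel: Box, depth: int = 0,
              max_depth: int = 8, leaf_size: int = 2) -> Node:
    # Terminate: few enough items, or deep enough (hypothetical criterion).
    if len(items) <= leaf_size or depth >= max_depth:
        return Node(items=list(items))
    # FindPlane: cycle through the axes, split at the voxel's midpoint.
    axis = depth % 3
    position = 0.5 * (voxel[0][axis] + voxel[1][axis])
    v_left, v_right = split(voxel, axis, position)
    t_left = [t for t in items if overlaps(t, v_left)]
    t_right = [t for t in items if overlaps(t, v_right)]
    return Node(axis, position,
                rec_build(t_left, v_left, depth + 1, max_depth, leaf_size),
                rec_build(t_right, v_right, depth + 1, max_depth, leaf_size))
```

An item that straddles the plane ends up in both children, which is why the depth limit matters: without it, overlapping items could keep the recursion going indefinitely.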

The Surface Area Heuristic (SAH) tends to perform quite well. It makes several assumptions:

  • Rays are uniformly distributed, infinite lines.
  • The cost for both a traversal and a triangle intersection are known.
  • The cost of intersecting N triangles is roughly N times the cost of intersecting one.

The expected cost for a given plane p is then one traversal step, plus the expected cost of intersecting the two children. Usually, for efficiency reasons, instead of minimizing the cost of traversing the entire tree, a greedy version of the heuristic is used which minimizes the cost at each level. We also need to define a termination criterion, i.e. when to stop the recursive build. SAH gives us an elegant one: stop when the cost of this node as a leaf is better than even the best split.
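The greedy cost and the termination test can be sketched in a few lines of Python. This assumes the usual form of the heuristic, in which the probability of a ray hitting a child is the ratio of the child's surface area to the parent's; the function names and the default values of the traversal cost k_t and the intersection cost k_i are placeholders I picked, not numbers from the paper.

```python
def surface_area(voxel):
    """Surface area of an axis-aligned box given as (min corner, max corner)."""
    (x0, y0, z0), (x1, y1, z1) = voxel
    dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def sah_cost(voxel, v_left, v_right, n_left, n_right, k_t=1.0, k_i=1.5):
    """Greedy SAH estimate: one traversal step plus the expected cost of
    intersecting every triangle in whichever children the ray enters."""
    area = surface_area(voxel)
    p_left = surface_area(v_left) / area
    p_right = surface_area(v_right) / area
    return k_t + k_i * (p_left * n_left + p_right * n_right)


def should_terminate(n_triangles, best_split_cost, k_i=1.5):
    """Stop when even the best split costs more than a leaf holding everything."""
    return best_split_cost > k_i * n_triangles
```

For a unit cube split down the middle, each half has two thirds of the parent's surface area, so a split only pays off when it separates the triangles well enough to offset the extra traversal step.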

The SAH tends to work best, and only a few modifications are known to consistently yield improvements. One of these is to favor splits that cut off empty space. Requiring the tree to be split a certain number of times can prevent the build from getting stuck in a local minimum, but this number is hard to determine for general scenes.

There are several ways to construct kd-trees once you have decided upon the cost function. The most trivial way is to iterate over all triangles, determine their possible split candidates, and compare the cost for each one. This ends up taking O(N^2) operations, where N is the number of triangles in the scene, which is often too slow.

There is also a widely-known algorithm for constructing SAH trees in O(N log^2 N). I’m not going to summarize the algorithm here: if you are interested, read the paper. It essentially ‘sweeps’ a plane, calculating costs incrementally. This is a huge improvement, but as the theoretical lower bound is O(N log N), we wish to do better.

This paper introduces an O(N log N) method of building a kd-tree. What caused the previous algorithm to be O(N log^2 N) instead of O(N log N) is having to sort once per partition. This algorithm improves on that by sorting the event list only once, which involves keeping the events for all dimensions in that one list. Again, I’m not going to recap the specifics of the algorithm here.

The two algorithms were then compared in actual time, and not just asymptotic behavior. Realistic models of varying size were used. The O(N log N) variant is consistently faster than the O(N log^2 N) variant by a factor of 2-3.5. Analysis shows that the O(N log^2 N) variant spends more time re-sorting than the other variant uses to build the entire tree.

Supporting Dynamic Languages on the JVM

Thursday, November 19th, 2009

Supporting Dynamic Languages on the Java Virtual Machine
Author: Olin Shivers

This paper is about enhancements to the JVM that would make it easier to port dynamic languages, such as Scheme, to it. This is desirable because there is now a JVM for practically everything; a language that runs on the JVM will run on every architecture the JVM runs on. However, the JVM, while well-designed for the speed of Java programs, does not support dynamic languages very well. One example of a language that suffers performance problems from this porting process is Scheme. In Scheme, there must be a uniform representation of data - values of any type can be stored in cons cells. Working this into the Java class model means that every type must extend Object, and boxing and unboxing primitive types is expensive.

The proposal to support immediates is to give pointers to Java objects a low bit of one. Since objects are allocated on word boundaries, this bit is otherwise unused, so nothing is lost. This allows a final ImmediateDescriptor class that has 31 bits of state and can quickly be converted to an even integer (or vice versa). This causes no penalty to programs not using immediates - if a method is called on one, this would generate a memory alignment exception that the VM can catch, which would still be fast.
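The tagging scheme, as I read the description above, can be modelled with a few bit operations. This Python sketch assumes a 32-bit word and 4-byte alignment, and all of the function names are invented for illustration; it models the encoding only, not anything a real JVM does.

```python
WORD_MASK = 0xFFFFFFFF  # assumption: 32-bit machine words


def tag_pointer(address):
    """Object pointers are word-aligned, so the low bit is free to be set to 1."""
    if address % 4 != 0:
        raise ValueError("object addresses must be word-aligned")
    return address | 1


def is_immediate(word):
    """Immediates are the even words; tagged pointers are odd."""
    return word & 1 == 0


def encode_immediate(value):
    """Pack a signed 31-bit integer into an even 32-bit word."""
    if not -(1 << 30) <= value < (1 << 30):
        raise ValueError("value does not fit in 31 bits")
    return (value << 1) & WORD_MASK


def decode_immediate(word):
    """Recover the signed 31-bit integer from an even word."""
    if not is_immediate(word):
        raise ValueError("word is a tagged pointer, not an immediate")
    if word & 0x80000000:      # sign bit set: reinterpret as negative
        word -= 1 << 32
    return word >> 1
```

Converting between an immediate and its integer value is a single shift, which is what makes the representation cheap compared with allocating a boxed object.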

There are still problems with issues such as method lookup. The JVM bytecode is well-optimized for Java code, but not necessarily for other paradigms. There is a tension in the bytecode between verification and efficiency - we don’t want an unsafe RISC bytecode system, we do want a safe system, but this will make it less efficient in a few cases. This tradeoff is made well for Java, but not for other languages.

A proposal to fix this is to have some of the bytecode instructions linked to C routines at runtime. This allows language implementers to efficiently represent whatever they need. However, this brings up the problem of verification - these C routines may be unsafe. The solution to this is to have some central body maintain a standard set of routines; the JVM will ‘check out’ the required instructions as told by the language implementer, and warn the user if the requested code is not in this standard set and must be checked out from an unverified location.

Acer Aspire One and Dropbox

Monday, November 16th, 2009

The motherboard in my ASUS laptop recently blew, so I ended up getting a netbook to use until I get a more powerful laptop. Since I mostly use my laptop for development and web browsing, I don’t have a very pressing need for a hugely powerful computer anyway. I got an Acer Aspire One, which was one of the cheapest ones I could find on short notice.

I really should have learned this by now, but you should regularly back up all your files. I had most of my documents in SVN and Dropbox, but losing my current changes to my repositories is pretty annoying. I’m currently copying over files, since my hard drive is OK, but I think I’m going to move as much as possible onto Dropbox.

Dropbox is a program that is used for backing up and synchronizing files across computers. It acts as just a regular folder on your computer, and when changes are made they are propagated up to Dropbox’s servers. It works offline and can be used with pretty much any OS. The first 2GB you use are free; if you need any more, they charge you.

The one problem I have with the netbook so far is the keyboard. It’s understandable that it’s small, but some of the keys are in strange positions, making it more annoying than it should be. For example, there are two ‘\’ keys, neither of which is in the correct position and both of which take up space better used by other keys. Their placement makes the left shift key very small, and the ‘Enter’ key narrower than I’m used to. I’m not sure that this was at all necessary.

Other than that, everything’s working pretty well. The screen is much smaller than I’m used to, which was expected, but is still somewhat disconcerting at first. If you have no reason to have a powerful laptop, I’d recommend just getting a cheap netbook and not spending a huge amount for a really good laptop.

The Clean Code Talks - Unit Testing

Thursday, November 12th, 2009

The Clean Code Talks - Unit Testing

This Google tech talk is about the benefit of unit tests. It starts out asking one fundamental question: how do you write hard-to-test code? Despite most developers (myself included) being good at writing hard-to-test code, most aren’t quite sure how they do it. The speaker describes some of these ways: mixing the ‘new’ operator into business logic, looking for things, doing work in the constructor, global state, and deep inheritance, among others. All of these make it hard to isolate specific parts of the code to test - in the deep inheritance case, you’re also testing superclasses; in the other cases, you can’t isolate one function from all the other functions. It’s also hard to test purely procedural code - there are no ‘seams’ that can be exploited to isolate specific parts.

The speaker then describes the progression of levels of testing. The highest level is scenario tests. These test the whole application, doing something a user would do and ensuring the correct thing happens, which makes them slow. If a test fails, it’s also hard to know precisely where it failed; you generally have to trace through with a debugger in order to find the specific place.

The next level of testing is functional tests. These test specific subsystems, with simulators for external parts. These are much faster than scenario tests, and you have a better idea of what failed. However, you still don’t know precisely what went wrong - if you’re testing a radio and it doesn’t work, you don’t have much of an idea why. You still have to use a debugger to isolate the point of failure.

This leads to the bottom level of testing: unit tests. These test individual classes in isolation from one another. They are very fast - the speaker suggests running them whenever you save. If all your unit tests pass, you have high confidence that the class in question is OK, even though you are unsure about the interaction between classes. It’s also very easy to figure out what precisely went wrong.

The speaker stressed that this is a continuum of tests; you shouldn’t have only unit tests or only scenario tests - you should have both, although unit tests are more important. It’s also important to have ‘seams’, where you can inject a class’s dependencies in order to mock out their behaviour. This is done using dependency injection, where a class asks for its dependencies in the constructor. Going back to the earlier point, a class should be in the business of either constructing and finding objects, or doing actual work - it’s easy to test either kind of class, just hard to test one that mixes the two.
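A small Python sketch of the idea follows. The class names (SmtpMailer, HardToTestSignup, Signup, FakeMailer) are invented for this example and are not from the talk; the point is only the contrast between constructing a dependency internally and asking for it in the constructor.

```python
class SmtpMailer:
    """Stand-in for a collaborator that is slow or has side effects."""

    def send(self, to, body):
        raise RuntimeError("would talk to a real mail server")


class HardToTestSignup:
    def __init__(self):
        # Construction mixed into the class: a test has no seam to
        # replace the mailer, so it always drags in SmtpMailer.
        self.mailer = SmtpMailer()


class Signup:
    def __init__(self, mailer):
        # The dependency is asked for, so a test can pass in a fake.
        self.mailer = mailer

    def register(self, email):
        if "@" not in email:
            return False
        self.mailer.send(email, "Welcome!")
        return True


class FakeMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))
```

A unit test for Signup then builds it with a FakeMailer and checks what was recorded, without any mail server being involved.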

Your Botnet is My Botnet: Analysis of a Botnet Takeover

Monday, November 9th, 2009

This is a summary of a paper, titled Your Botnet is My Botnet: Analysis of a Botnet Takeover.

Botnets are becoming a large problem for the internet. They are networks of compromised computers under the control of some other person, and they are becoming the primary means for criminals to launch DoS attacks, steal personal data, or commit other cyber crimes.

Most previous analyses of botnets have worked from the inside: intentionally infecting a computer so that it joins the botnet, and analyzing the activity that then occurs. Since many botnets use P2P protocols, other infected computers can be discovered using this technique. However, this gives a very limited view of the activities of the botnet. A better way is to take control of the entire botnet, which can be done with cooperation from either domain registrars or law enforcement.

For this paper, researchers took control of the Torpig botnet. Torpig is primarily associated with bank account and credit card theft. This was done by exploiting how the bots try to locate their command server. Each bot generates a list of domains to contact, and the first host that sends a reply identifying itself is considered genuine until the next domain generating phase. This allowed researchers to register domains the infected hosts would contact.
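The reason this rendezvous scheme can be hijacked is that the domain list is a deterministic function of the date, so anyone who knows the function can compute the same list ahead of time and register the names first. The toy Python sketch below shows only that property; it is not Torpig's actual algorithm, and the hashing, label length, TLD list and function name are all made up for illustration.

```python
import datetime
import hashlib


def candidate_domains(day, count=3, tlds=("com", "net", "biz")):
    """Derive a list of rendezvous domains from the ISO week of `day`.

    Toy example: every client, and every observer, that runs this for the
    same week gets the same list.
    """
    year, week, _ = day.isocalendar()
    domains = []
    for index in range(count):
        seed = f"{year}-{week}-{index}".encode()
        label = hashlib.sha256(seed).hexdigest()[:10]
        domains.append(f"{label}.{tlds[index % len(tlds)]}")
    return domains
```

Two dates in the same week produce identical lists, while the following week produces a different one, which is what lets an observer pre-register next week's names.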

Torpig is distributed to its victims using Mebroot, a rootkit that replaces a system’s Master Boot Record. Victims are infected through vulnerable web sites being modified so that the victim’s browser requests Javascript, which then attempts several exploits. If any are successful, an installer for Mebroot is downloaded and executed. Mebroot does not perform other malicious attacks itself; it acts as a platform to install malicious modules. Mebroot contacts the C&C server every two hours to receive updates.

The C&C server distributes three modules, which together comprise Torpig. Mebroot injects these DLLs into the file manager, Internet Explorer, Firefox, and other popular programs, allowing Torpig to inspect all data handled by them. Every twenty minutes, Torpig uploads new data to the command server. In reply, the C&C server can respond either with ‘ok’ or with a configuration file containing settings and the parameters used to perform phishing attacks. These attacks can gain data that could not be obtained by passive monitoring. When the user goes to a site in the configuration file, they will instead be redirected to a site given by an injection server.

Taking over the botnet was fairly simple; domains were registered for a three-week period. Logs of all network data were collected until a new Torpig binary that changed the domain generation algorithm was installed through Mebroot. 70GB of data were collected during the 10-day period that the Torpig botnet was under the researchers’ control.

All bots communicate with the Torpig command server through HTTP POST requests. Each request contains all the collected data, as well as information about the bot. There are 8 different types of data that Torpig sends out: mailbox accounts, email, form data, HTTP accounts, FTP accounts, POP accounts, SMTP accounts, and Windows passwords.

Attempting to analyze the size of the botnet is somewhat difficult. It can’t be done by merely checking how many IPs connect to the C&C server, because of NAT and DHCP. However, each Torpig submission contains hardware configuration information and a mostly-unique ID for the bot. This led to an estimated 182,914 bots in the Torpig botnet. Further analysis was done to find the number of security researchers and search engine bots to get a more accurate number. Security researchers could be found by checking for the default hardware configurations of VMware and other virtualization tools. This gave a final estimate of 182,800 bots. In contrast, the number of IPs connecting to the C&C server was an order of magnitude larger. In the ten days the botnet was taken over, 49,924 new hosts were infected, though there were large spikes on two days.
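The gap between the two ways of counting is easy to see on a toy log. In this Python sketch, the record layout (dictionaries with "ip" and "bot_id" keys) and the function name are assumptions made for the example, not the format Torpig uses.

```python
def count_hosts(requests):
    """Return (distinct IP addresses, distinct bot IDs) seen in a request log."""
    ips = {request["ip"] for request in requests}
    bot_ids = {request["bot_id"] for request in requests}
    return len(ips), len(bot_ids)
```

A single bot whose DHCP lease changes shows up under several IPs, while several bots behind one NAT gateway share an address, so the IP count can be off in either direction; counting IDs sidesteps both effects.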

Torpig is crafted to retrieve information that can easily be monetized. In the ten days, Torpig obtained 8310 accounts at financial institutions. 1660 credit and debit card numbers were also obtained. By pricing these accounts, the estimated value from these ten days is between $83K and $8.3M. In addition to information retrieval, Torpig opens proxies that can be used for spam or other activities, and represents a great deal of bandwidth that can be used in a DoS attack. It also logs all other data it sees, which represents a huge breach of privacy and can be used to look at all chat, email, and other messages sent.

Analysis of the passwords retrieved showed that most were not very high strength, and roughly 28% of users reuse their passwords. This is evidence that the reason these botnets are so large is a cultural problem: people not understanding the consequences of irresponsible computer use.

Coders at Work

Friday, November 6th, 2009

There’s been a lot of talk about Peter Seibel’s new book, Coders at Work, recently, so I decided to read it as well. I’m definitely glad I did; it’s a very readable book, with some very good programmers’ and designers’ views on debugging, programming, and other technical topics. Among those interviewed are jwz, Peter Norvig, and of course Donald Knuth. This book seems inspired by another one I’m reading, Masterminds of Programming, but I enjoyed this one a lot more.

The book consists of 17 interviews with different programmers in a wide variety of domains. There’s a lot to absorb, and it’s pretty instructive to see how many of these coders don’t rely on modern tools such as debuggers or IDEs. While the interviews are pretty organic and aren’t exactly alike, Seibel does ask all of them some of the same questions, and it’s nice to see that different approaches can work equally well. Each programmer’s approach to API design is one of the big ones he tends to ask about, and it is what I’d consider one of the most important parts of being a programmer.

I’d definitely recommend reading this book - there is a lot of useful information to absorb from it. It’s hard to pinpoint specific lessons, but it should at least make you think about your methods and techniques. It’s a very readable book, enjoyable to read and easy to understand. I’m with Joel on this - you should definitely read this book.



Java and C++ Utilities

Wednesday, November 4th, 2009

I’ve been working on some utilities for coding in Java. JDE and CEDET ground my Emacs to a halt the last time I tried them, so I wanted something lightweight. So far, I mostly have some functions for looking up documentation - including C++ documentation - that I store locally on my computer and keep in a repository, but I also have a few utilities for auto-importing Java classes.

The utilities to follow need these macros defined. I talked about them previously at:
http://nflath.com/2009/08/emacs-timing-and-upgrades/. They are utilities for generating functions that take an argument defaulting to the word at point.

(defun my-fn (fn prompt)
  "When given a function taking one argument and applying a function to it, will use that function
   and default to the word at point, with a prompt including that word."
  (let ((default (current-word)))
    (let ((needle (read-string (concat prompt " <" default ">: "))))
      (if (equal needle "")
          (funcall fn default)
        (funcall fn needle)))))
 
(defmacro defun-my (name prompt &rest body)
  "Will define both a function and a my- version of the function,
which defaults to the word at point."
  `(progn
     (defun ,name (arg) ,@body)
     (defun ,(intern (concat "my-" (symbol-name name))) ()
       (interactive)
       (my-fn (quote ,name) ,prompt))))

The next functions are used by some of the later ones. They cache large directory structures in a buffer, so that files can be found by searching the buffer instead of parsing the output of ‘find’ each time. Specifically, I use these to quickly look up which file I should be referencing to view documentation on Java and C++ classes. create-file-list will just create a list of files in the given buffer, and find-location-for-doc-from-buffer will return the full path of the matching html file you are searching for. java-find-html-for-class is just a helper function that fills in the arguments for find-location-for-doc-from-buffer for Java buffers.

(defun create-file-list (directory buffer)
  "Creates the list of files in a directory"
  (save-window-excursion
    (let ((default-directory directory))
      (shell-command "find . " buffer)
      (switch-to-buffer buffer)
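      ;; flush-lines works from point onward, so start at the top each time.
      (goto-char (point-min))
      (flush-lines "\\.svn")
      (goto-char (point-min))
      (flush-lines "class-use"))))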
 
(defun find-location-for-doc-from-buffer (arg buffer-name buffer-creation-fn begin)
  "Finds the file for a given documentation name in the buffer
that may be created with buffer-creation"
  (save-excursion
    (save-window-excursion
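      (let ((doc-buffer (or (get-buffer buffer-name)
                            ;; The creation function does not return the
                            ;; buffer, so look it up again afterwards.
                            (progn (funcall buffer-creation-fn)
                                   (get-buffer buffer-name)))))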
        (switch-to-buffer doc-buffer)
        (goto-char (point-min))
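        ;; `line-matches' is not built in; it is assumed to be a helper
        ;; defined elsewhere in the init file.
        (while (not (line-matches (concat "/" arg "\\.html")))
          (search-forward arg))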
        (concat begin
                (buffer-substring (1+ (line-beginning-position))
                                  (line-end-position)))))))

These next functions are used to look up documentation. my-java-describe-class will open up documentation for the input class, whereas java-describe-variable will take a variable name, look backwards to its declaration, and find documentation for that class. c-search-docs does something similar; it will prompt you for a keyword and see if anything in my C++ documentation matches it.

(defun-my java-describe-class "Open Javadoc for Class"
  "Loads javadoc for specified class in your browser."
  (interactive "MClass Name: ")
  (browse-url (java-find-html-for-class arg)))
 
(defun-my java-describe-variable "Open Javadoc for Variable"
  "Opens the javadoc for the variable at point, if possible."
  (interactive)
  (save-excursion
    (re-search-backward (concat "[ \t\n]"
                                "[A-Za-z]+"
                                "<[][A-Za-z0-9<>]*>"
                                "[ \t\n]"
                                arg))
    (forward-char)
    (java-describe-class (current-word))))
 
(defun-my c-search-docs "Documentation For"
  "Searches C++ Documentation for the requested term"
  (browse-url (find-location-for-doc-from-buffer
               arg
               "*C Documentation*"
               (lambda () (create-file-list "~/.emacs.d/documentation/c++/" "*C Documentation*"))
               "~/.emacs.d/documentation/c++/")))

Another task that I frequently have to do is fix imports in Java classes. Doing this manually is a huge pain, so I wrote a few functions to help. my-java-import-class will prompt for a class, look up its full package name, and add the import to the top of your file. java-import-undefined-classes will run compile-command and parse the output (using java-get-undefined-class-names) to add imports for all unimported classes. This needs java-undefined-symbol-regexp to be defined correctly, as well as compile-command to be set to something like ‘javac filename’.

(defun-my java-import-class "Import Class"
  "Adds an import statement for the class at point."
  (save-excursion
    (let ((my-retn-value nil))
      (let ((my-string (java-find-html-for-class arg)))
        (find-file my-string)
        (goto-char (point-max))
        (re-search-backward "\"\\([A-Za-z0-9]+\\.\\)+[A-Za-z0-9]+ [ci][ln][at][se][sr]" )
        (let ((start (point)))
          (re-search-forward " " )
          (setq my-retn-value (substring (buffer-string) start (- (point) 2)))))
      (kill-buffer (current-buffer))
      (goto-char (point-min))
      (re-search-forward "import " (point-max) t)
      (beginning-of-line)
      (when (looking-at "import")
        (end-of-line)
        (newline))
      (insert "import " my-retn-value ";\n" ))))
 
(require 'cl) ; for remove-if and remove-duplicates
(defvar java-undefined-symbol-regexp "symbol  : class \\([A-Za-z0-9]*\\)")
(defun java-get-undefined-class-names ()
  (interactive)
  (save-window-excursion
    (remove-if
     #'not
     (remove-duplicates
      (mapcar (lambda (x)
                (if (string-match java-undefined-symbol-regexp x)
                    (match-string 1 x)))
              (split-string (shell-command-to-string compile-command) "\n *^\n")) :test #'string-equal))))
 
(defun java-import-undefined-classes ()
  (interactive)
  (save-window-excursion
    (mapc #'java-import-class (java-get-undefined-class-names))))
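The output-parsing step is easy to try outside Emacs as well. This Python sketch mirrors what java-get-undefined-class-names does, using the same pattern as java-undefined-symbol-regexp; the function name is made up, and it assumes the compiler reports a missing class on a line of the form "symbol  : class Name".

```python
import re

# Same pattern as java-undefined-symbol-regexp above (two spaces after "symbol").
UNDEFINED_CLASS = re.compile(r"symbol  : class ([A-Za-z0-9]+)")


def undefined_class_names(compiler_output):
    """Return each undefined class name once, in order of first appearance."""
    names = []
    for name in UNDEFINED_CLASS.findall(compiler_output):
        if name not in names:
            names.append(name)
    return names
```

Each returned name is what would then be handed to the import function.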

Installing Trac

Monday, November 2nd, 2009

As part of a group project I’ll be doing over the next few terms, I had to set up a few utilities - trac, mediawiki, and reviewboard. It took me a while to figure out how to install these properly on a shared server - none of the tutorials I saw were entirely correct - so I figured that I should write my own.

Trac is an issue-tracking system that integrates with your version control system so you can track bugs by commits. I haven’t used it extensively yet - I’ll probably do another post on usage once I’ve been using it more. The first thing you need to do to install trac is to install all the required packages. On Ubuntu, you can do this with:

sudo apt-get install apache2 libapache2-mod-python libapache2-svn python-setuptools subversion python-subversion

Next, you need to create a base ‘trac’ directory somewhere on your filesystem and allow the apache user www-data to be able to read and write to it. If you want it to be in /home/trac, for example, you can do the following:

sudo mkdir /home/trac
sudo chown www-data:www-data /home/trac

Now, to create an individual trac project you need to know two things: 1) the name of your project, which is your choice, and 2) the location of your source control repository. Trac supports several different source control systems; we’re using it with subversion. Run trac-admin’s initenv command on a new directory under your trac base directory (for example, sudo trac-admin /home/trac/[projectName] initenv), answer all the questions it asks you, and your project will be created.

Next, you need to create the apache configuration file for this. Assuming that you want trac to be accessed by trac.yourdomain.com, create the file /etc/apache2/sites-available/trac.yourdomain.com from the following template, changing the values in []:


<VirtualHost *:80>
        ServerAdmin [yourEmail]
        ServerName [trac.yourDomain.com]

        ErrorLog [errorLogFile]
        CustomLog [logFile] combined

        # Possible values include: debug, info, notice, warn, error, crit,
        # alert, emerg.
        LogLevel warn
        ServerSignature On

        <Location />
                SetHandler mod_python
                PythonInterpreter main_interpreter
                PythonHandler trac.web.modpython_frontend
                PythonOption TracEnvParentDir [tracParentDirectory (e.g /home/trac)]
                PythonOption TracUriRoot /
                PythonOption PYTHON_EGG_CACHE /tmp
        </Location>

        # use the following for one authorization for all projects
        # (names containing "-" are not detected):
        # (this pattern assumes projects are served directly under /)
        <LocationMatch "/[[:alnum:]]+/login">
                AuthType Basic
                AuthName "trac"
                AuthUserFile /var/svn/conf/svnusers
                Require valid-user
        </LocationMatch>
</VirtualHost>

This requires users to be authenticated to access trac.yourdomain.com. To create an authentication file, you need to do the following:

cd /var/svn/conf/
htpasswd -cb svnusers [username] [newpass]
htpasswd -b svnusers [username2] [newpass2]

Repeat this for all the users you wish to be able to authenticate. You also need to define a policy file in /var/svn/conf/svnpolicy. This file has the following format:

[project1:path]
user1 = rw
user2 = r
user3 = rw

[project2:path]
user1 = rw
user2 = rw
user3 = rw

The project name will be the name of your trac project; the path is probably /, unless you want some parts of your trac setup to be authenticated differently than others. This should set up the authentication for trac.
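Since a typo in the policy file is easy to miss, it can be worth sanity-checking it before reloading apache. The following Python sketch parses the format shown above and rejects access values other than r, rw, or empty (no access); the function name load_policy is made up for this example, and it only checks the simple user = access lines used here.

```python
import configparser

VALID_ACCESS = {"", "r", "rw"}


def load_policy(text):
    """Parse a policy file into {section: {user: access}}, validating access."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep user names case-sensitive
    parser.read_string(text)
    policy = {}
    for section in parser.sections():
        entries = dict(parser.items(section))
        for user, access in entries.items():
            if access not in VALID_ACCESS:
                raise ValueError(
                    f"[{section}] {user}: unknown access value {access!r}")
        policy[section] = entries
    return policy
```

Calling load_policy on the contents of /var/svn/conf/svnpolicy either returns the parsed permissions or raises a ValueError naming the offending section and user.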

After this, you need to symlink the file you created in sites-available to sites-enabled and reload and restart apache:

cd /etc/apache2/sites-enabled
sudo ln -s ../sites-available/trac.yourdomain.com .
sudo /etc/init.d/apache2 reload
sudo /etc/init.d/apache2 restart

After this, the site should be accessible if you edit your hosts file to redirect to it by adding the following line to /etc/hosts:

[siteip] trac.yourdomain.com

However, you probably want to be able to access it from any computer without modifying your hosts file. To do this, you need to go into whatever DNS manager you use and add an entry for trac.yourdomain.com that points to your server.

This ended up being longer than I expected, so I’ll cover mediawiki and reviewboard in later posts. Let me know if anything here is incorrect or more should be covered.