[Gimp-web] What should we do with /admin/gimp-web-urls?
Raphaël Quinet
raphael at gimp.org
Wed Mar 28 07:08:36 PDT 2007
Hi,
I noticed that some of the changes made to the web site (not only in
the last few weeks) have included direct links to other pages instead
of using the attribute-rewriting scheme that was created for this site,
with the list of substitutions defined in /admin/gimp-web-urls. For
example, there are some links with href="/unix/" instead of
href="wgo:unix".
Then I started wondering to what extent the "wgo:" substitutions were
used for internal links (inside www.gimp.org - wgo). To be really
useful, these substitutions should be used more than twice, otherwise
the overhead of having to edit two separate files (the page that you
are editing + the list of links) negates the advantage of having a
centralized list of links. I knew that some substitutions such as
"wgo:unix", "wgo:windows" or the e-mail addresses of various tutorial
authors were used in a couple of places, but I did not have a good
picture for the majority of these substitutions.
So I decided to write a script that parses all *.htrw files in the
tree and checks how many times each substitution is used. I also
wanted to know if there was a significant difference between the
"internal" links (with the "wgo:" prefix) and the other links. After
a couple of hours, I ended up with a script that produced the
following output:
Parsed 116 htrw files.
Distribution of 408 substitutions:
- used 0 times: 121 substitutions (29.7%)
- used 1 times: 204 substitutions (50.0%)
- used 2 times: 50 substitutions (12.3%)
- used 3 times: 13 substitutions ( 3.2%)
- used 4 times: 9 substitutions ( 2.2%)
- used 5 times: 3 substitutions ( 0.7%)
- used 6 times: 1 substitutions ( 0.2%)
- used 7 times: 3 substitutions ( 0.7%)
- used 9 times: 1 substitutions ( 0.2%)
- used 11 times: 1 substitutions ( 0.2%)
- used 16 times: 1 substitutions ( 0.2%)
- used 22 times: 1 substitutions ( 0.2%)
Distribution of 171 internal substitutions (with prefix "wgo"):
- used 0 times: 42 wgo substitutions (24.6%)
- used 1 times: 90 wgo substitutions (52.6%)
- used 2 times: 26 wgo substitutions (15.2%)
- used 3 times: 7 wgo substitutions ( 4.1%)
- used 4 times: 3 wgo substitutions ( 1.8%)
- used 5 times: 2 wgo substitutions ( 1.2%)
- used 7 times: 1 wgo substitutions ( 0.6%)
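The counting step can be sketched roughly as follows. This is not the
original script, only a minimal Python illustration. It assumes that a
substitution key appears as the complete value of an href or src
attribute (as in href="wgo:unix"), and that the keys defined in
/admin/gimp-web-urls have already been loaded into a list, since the
format of that file is not shown here. The names count_usage,
distribution and report are hypothetical.

```python
import re
from collections import Counter

# Assumption: a substitution key is the whole value of an href/src attribute.
ATTR_RE = re.compile(r'''(?:href|src)\s*=\s*["']([^"']+)["']''', re.IGNORECASE)


def count_usage(defined_keys, pages):
    """Count how often each defined key is used as an attribute value.

    defined_keys: iterable of keys such as "wgo:unix" (already parsed from
                  the substitution list; its file format is not assumed).
    pages:        dict mapping a file name to the text of that *.htrw file.
    """
    # Start every key at 0 so that unused substitutions are reported too.
    usage = Counter({key: 0 for key in defined_keys})
    for text in pages.values():
        for value in ATTR_RE.findall(text):
            if value in usage:
                usage[value] += 1
    return usage


def distribution(usage):
    """Map "times used" to "number of keys used that many times"."""
    return Counter(usage.values())


def report(usage, label="substitutions"):
    """Format the distribution in the style of the output shown above."""
    total = len(usage)
    lines = [f"Distribution of {total} {label}:"]
    if total == 0:
        return lines[0]
    for times, n in sorted(distribution(usage).items()):
        lines.append(
            f"- used {times} times: {n} {label} ({100 * n / total:4.1f}%)"
        )
    return "\n".join(lines)


# Tiny made-up example (not real site content).
pages = {
    "index.htrw": '<a href="wgo:unix">Unix</a> <a href="wgo:dev">Dev</a>',
    "unix.htrw": '<a href="wgo:unix">Unix</a> <a href="/unix/">direct</a>',
}
usage = count_usage(["wgo:unix", "wgo:dev", "wgo:windows"], pages)
print(report(usage))
```

Restricting the report to the internal links is then a matter of
filtering the keys on the "wgo:" prefix before calling report().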
I was surprised by the results and I thought that my script had some
bugs and did not count correctly, but after double-checking the
results, I could confirm that they were correct. I did not expect
that we had so many unused substitutions. Also, both for the total
(92.0%) and for the internal links only (92.4%), about 92% of the
substitutions are never used or used only once or twice in the whole
site. That is much more than I expected! This leaves only 8% of the
substitutions that are used more than twice and provide real benefits
compared to having to maintain the links separately. For the
majority of the links on www.gimp.org, the attribute-rewriting system
is a net loss. In addition, the substitutions that are used the most
are also the ones that are the least likely to change:
- Among the internal links, the most used ones are "wgo:dev" (7),
"wgo:dev-news" (5), "wgo:unix" (5), "wgo:bugs" (4),
"wgo:mail_lists" (4) and "wgo:unix-howtos-tile_cache" (4). Most of
these are stable links, unlikely to change often.
- Among the external links, the most used ones come from the
tutorials because each tutorial includes two links to the e-mail
address of the author and one link to their page. So the one that
is used 22 times is "mail:People-Jeschke_Eric_R" and this matches
the 11 occurrences of "home:People-Jeschke_Eric_R" because Eric
is the author of 11 photo-related tutorials.
This attribute-rewriting system is the main reason why we have a
build system for the web site and why we are editing *.htrw files
instead of editing *.html pages directly. Considering these surprising
usage statistics, it looks like this system does not really bring
the expected benefits. I think that it was a good idea, but it did
not really work in practice.
If anybody bothered reading this far, I would like to ask for opinions
about what to do with /admin/gimp-web-urls. I know that "if it's not
broken, don't fix it" but in this case, the system that was designed
to simplify the maintenance of the web site is making it more complex
and it is even slightly broken: for example, the Bugzilla howto
includes broken links to "wgo:wingimp" (should be "wgo:windows") and
"wgo:gimp-unix" (should be "wgo:unix") that had not been detected
until I wrote this script. I also know that after the previous
overhaul of the web site in 2003, Carol accused me of "destroying
stuff" when I wanted to simplify things. So this time, I want to
check what everybody thinks before starting or planning some actions...
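
Broken keys such as "wgo:wingimp" can be found with the reverse check:
look for attribute values that carry a substitution prefix but are not
defined in the list. The following is only a sketch in Python under
stated assumptions, not the script mentioned above. It assumes the same
attribute syntax as href="wgo:unix", takes the set of prefixes ("wgo",
"mail", "home") from the examples in this message, and uses a made-up
file name. The function name undefined_keys is hypothetical.

```python
import re

# Assumption: a substitution key is the whole value of an href/src attribute.
ATTR_RE = re.compile(r'''(?:href|src)\s*=\s*["']([^"']+)["']''', re.IGNORECASE)


def undefined_keys(defined_keys, pages, prefixes=("wgo", "mail", "home")):
    """Return {key: [files using it]} for prefixed keys that are not defined.

    prefixes is an assumption based on the keys quoted in this message;
    ordinary URLs ("http:", "mailto:", "/unix/") are ignored.
    """
    defined = set(defined_keys)
    broken = {}
    for name, text in pages.items():
        for value in ATTR_RE.findall(text):
            prefix, sep, _ = value.partition(":")
            if sep and prefix in prefixes and value not in defined:
                broken.setdefault(value, []).append(name)
    return broken


# Made-up page content reproducing the two broken keys mentioned above.
pages = {
    "bugzilla.htrw": (
        '<a href="wgo:wingimp">Windows</a> '
        '<a href="wgo:gimp-unix">Unix</a> '
        '<a href="wgo:unix">Unix</a> '
        '<a href="http://bugzilla.gnome.org/">Bugzilla</a>'
    ),
}
broken = undefined_keys(["wgo:windows", "wgo:unix"], pages)
for key, files in sorted(broken.items()):
    print(key, "used in", ", ".join(files))
```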
-Raphaël