snapsvg

2014-12-17

Day 17: A complex and detailed investigation into the various merits and faults of the assorted combinations of codepage, character set and byte encoding of human-readable text.

There are 128 characters in ASCII and tens of thousands of characters in the real world. It is probably an interesting debate, trying to come up with the most efficient way of encoding non-ASCII characters without screwing everything up.

Don't waste your time. Use UTF-8 and Unicode.

"But what about UTF-16?" No.

"But what about--" NO.

ASCII is included in UTF-8 Unicode. So is everything else. Everyone understands it, everything's assuming it, and all the other encodings and charsets are more obscure and therefore harder to deal with.
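
A quick sketch of the "ASCII is included" point, using the core Encode module. The sample strings are made up for illustration; nothing here is from a real system.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Pure ASCII text encodes to exactly the same bytes in UTF-8.
my $ascii       = 'plain old ASCII';
my $ascii_bytes = encode('UTF-8', $ascii);

# Anything outside ASCII simply takes more bytes per character.
my $word       = "na\x{EF}ve";              # five characters, one of them U+00EF
my $word_bytes = encode('UTF-8', $word);    # six bytes: U+00EF takes two

my $snowman_bytes = encode('UTF-8', "\x{2603}");    # U+2603 takes three bytes

# Decoding the bytes gives back the original characters.
my $round_trip = decode('UTF-8', $word_bytes);

printf "%d chars, %d bytes\n", length $word, length $word_bytes;
```

The "bloat" is only paid by the characters that ASCII couldn't represent in the first place.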

Everyone (except PHP) has UTF-8 Unicode built in to whatever programming language they're using.

Unless you're writing for devices with memory measured in bytes and a network connection measured in baud then you have time and space to use the bloating of UTF-8 Unicode. So suck it up, be inefficient, and accept the VHS of UTF-8 over the Betamax of whatever you're looking all cow-eyed at today.

And, in case you were wondering, ASCII is never the right answer.

2014-12-16

Day 16: Web::Machine

Web::Machine is pretty cool because it reorganises the way you think about your website's structure, focusing on the perspective you should really be starting with in the first place.

Web::Machine encourages you to construct several objects, each of which handles a URI by representing the resource to which that URI points.

Remember that URI is a Uniform Resource Identifier. We've had this discussion. The parts of the internet that use URIs are based on the assumption that they are sharing information about resources, and hence the focus is on the resource.

Web::Machine starts with the resource. You construct an object and mount it as a Plack application to handle the URI to that resource. These objects are actually the machines. You construct a Web::Machine with a subclass of Web::Machine::Resource, and if that's all you want to do, you call ->to_app on it and plack it up.

Each Web::Machine so constructed is a Plack::Component. That means you can bring in a Plack::Builder and mount machines in it.

my $builder = Plack::Builder->new;
$builder->mount(
    '/resource' => Web::Machine->new( 
        resource => 'MyApp::Resource'
    )
);

Alternatively, you might prefer to use something like Path::Router, providing subs that build Web::Machines based on arguments.

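my $router = Path::Router->new;
$router->add_route('/resource/:id' => (
    # Path::Router takes the handler under the "target" key. This assumes the
    # router is served by Plack::App::Path::Router, which calls the target
    # with the request followed by the matched path parts.
    target => sub {
        my ($req, $id) = @_;
        Web::Machine->new(
            resource => 'MyApp::Resource',
            resource_args => [
                id => $id,
            ],
        )
        ->call($req->env);
    },
));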

Two things are notable about this particular invocation. The first is that it is necessary to run call on the resulting machine manually. The second is that, now that we have actual args coming in, we can see that Web::Machine takes an array ref for these, not a hashref; i.e. it's an argument list and is not required to be hash-shaped.

MyApp::Resource is what handles the actual magic: Web::Machine expects certain subroutines to be overridden from the base class Web::Machine::Resource that define what this resource can do.

The sensible ones to provide are content_types_provided and the to_* filters that define how to represent this resource as the various content types it supports.

The documentation lists all of the functions that can be overridden to provide behaviour specific to this class.

RFPR: Web::HyperMachine

I've started taking this a step further. Resources are only part of what makes the interwebs work. The other part is the fact the resources are related to each other: hypermedia.

Up on the githubs is a start to the module Web::HyperMachine, which tries to wrap Web::Machine in an understanding of how the resources relate to one another. By adding a couple of DSL-like functions to the Resource class it is possible to automatically construct the URI schema for the system, using the declared names of resources and relationships within the resource classes themselves.

The user simply mounts those resources and the machine does the rest:

#!/usr/bin/perl
use strict;
use warnings;
use Web::HyperMachine;

my $app = Web::HyperMachine->new;
$app->with('MyApp::Resource');      
$app->to_app;

And the resource would be e.g.:

package MyApp::Resource;
use strict;
use warnings;

use parent 'Web::HyperMachine::Resource';

__PACKAGE__->uri('resource');

our @data = qw( hello hi hey howdy );

sub content_types_provided { [{ 'text/html' => 'to_html' }] }

sub fetch {
    my ($self, $id) = @_;
    return $data[$id];
}

sub to_html {
    my $self = shift;
    my $resource = $self->{resource};

    q{<h1>} . $resource . q{ world</h1>}
}

1;

If you plackup that script, you'll find that /resource/0[1] will return an HTML page with "hello world" in it; and other values will correspondingly index into the array.

Feedback on this concept is encouraged; it's not been worked on for some time, like most things I do, because I got bored of it, because I didn't have an actual use for it.

[1] If 0 doesn't appear to work, you may have an outdated version of Path::Router. The issue tracker says it is fixed on CPAN now.

2014-12-15

Day 15: Crime and Punishment

In today's post I'm going to try to convince you to think of the interfaces you make in terms of punishment, in order to find the path of least punishment.

Here's a perspective for you to consider: when someone uses your system, they are doing you a favour. Don't try to yes-but-what-if your way out of this; I'm not asserting that it is the case. I am saying that is how you should consider it to be. Assume that the user, given the option, will pick an alternative system. Design the interface from the point of view that it is the very fact people use the system that is the currency that measures its success. If people don't like using it, if you make it hard to do, they simply will stop doing so.

This is an important perspective if you are a business, because your system needs to get the user from state 1, wherein they have their money, to state 2, wherein you have their money. If you make that difficult to do, then they won't do it. You are not doing them a favour; don't treat them like you are.

Punishment

Punishment probably makes you think of unwanted tasks doled out to people for correction or restitution of some misdemeanour or other. This is a bit of a goal-oriented definition, because it implies a perpetrator in the first place; i.e. it expects that some misdeed has been undertaken for which recompense needs to be made.

People are, of course, falsely accused and given punitive action nevertheless. The focal point of the above definition is that of an unwanted task; some chore that must be gone through, which one is inconvenienced, perhaps embarrassed or humiliated, to do. The concept is one of a strong antipathy or disinclination to do the thing; hence it is considered punitive to require that the person do it.

Crime and Punishment

When you design an interaction between a human and a computer you are establishing a sequence of events that will allow the user to eventually find themselves in a situation whereby the thing they set out to do has been done. Within this highly abstracted scenario there are three players:

  • You (the entity with which the task is being performed)
  • The user (the entity trying to perform the task)
  • The task (the sequence of events by which the thing moves from not-done to done)

This set of three players has implied with it several types of tasks:

  • Expected but trivial; these things do not inconvenience
  • Expected but undesirable; the user has prepared for this
  • Unexpected but trivial; these things are minor inconveniences
  • Unexpected and undesirable; necessary evils
  • Unexpected and undesirable and avoidable; punishment

When you design an interface and you've added something to that interface, seriously consider whether that thing can be considered punishing the user for something they didn't do wrong.

Especially consider whether it is punishment for something out of their control. In many cases it is necessary to inform the user that there was a problem; this may seem like punishment, because it is quite undesirable to have to go through all that again.

Well, it is. Reduce the impact of problems by not discarding all the information the user has entered. If the problem is on your side, don't force the user to pick up the pieces, because they won't. If the problem is on their side, only require the re-entry of that information - not the entire thing.
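
As a rough sketch of "only require the re-entry of that information": validate each field on its own, keep whatever passed, and complain only about what didn't. The field names and rules below are hypothetical, purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical validation rules, one per field.
my %rules = (
    email    => sub { $_[0] =~ /\A[^\@\s]+\@[^\@\s]+\z/ },
    postcode => sub { $_[0] =~ /\S/ },
);

# Returns the values worth keeping and a message for each field that failed.
sub check_form {
    my (%input) = @_;
    my (%keep, %errors);
    for my $field (sort keys %rules) {
        my $value = $input{$field} // '';
        if ($rules{$field}->($value)) {
            $keep{$field} = $value;      # pre-fill this when the form is shown again
        }
        else {
            $errors{$field} = "Please check the $field field";
        }
    }
    return (\%keep, \%errors);
}

# The email is fine, so only the postcode needs asking for again.
my ($keep, $errors) = check_form(email => 'me@example.com', postcode => '');
```

The point is the shape of the return value: the good data survives the failure, so the user is never asked for it twice.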

And if there isn't a problem, why are you making one?

Amazon

Amazon punished me recently. They have this 1-Click registered-trademark button that allows you to find something you want and have it on its way to you just by pressing a button. That's a great feature - they are absolutely doing me a favour by having it. And they do me a second favour by letting me amend the order for up to 30 minutes after it's created.

Then they punish me for wanting to do that.

If you try to change the delivery address of such an order you are required to "confirm" your payment details. Why? They told me (on Twitter) that it was a security precaution to prevent others from accessing my personal information.

What utter, rotten bullshit. This is rubbish design, pure and simple. If I didn't change my delivery address, I would not have to confirm anything! This is unexpected, undesirable, and completely avoidable. It is punishment for wanting to have it delivered somewhere else. That is not a punishable offence.

SimplyBe

I get very upset sometimes. SimplyBe are absolutely not the sort of company that want me to give them any money. Every single step in between me selecting a product and me paying for the product was a pain in the arse.

Here are the necessary evils of buying something online:

  • Entering your payment details
  • Telling them where to send the product

That is it. Everything else beyond that is you not doing me a favour. Sometimes we accept certain things, like do you want to sign up for the newsletter? (No.) But there are really only two things a place needs to know about you in order to get your money from your pocket and into theirs. If they punish you for trying to do that, go somewhere else.

For the curious, my tirade can also be seen on Twitter, written live as I came across the problems with the checkout. Finding it is left as an exercise to the reader. Every single tweet in that set is about something I consider a punishment, and I consider myself as having been punished for wanting to give them money.

Metro 2033

I first started thinking about interfaces in terms of punishment while playing this game, Metro 2033, of which many readers may have heard. It was touted as one of the best games of whatever year I missed it in when it first came out. It's set in the subway of Moscow - the Metro - where humanity has retreated from whatever disaster has yet to be revealed.

The game goes, by stages, from stealth to survival to legging it to brawling to just wandering around in a township buying stuff. And it punishes you.

Progress in the game is saved by a checkpoint mechanic, although it doesn't tell you where the checkpoints are. All you know is that, if you die, you're going to be set back some arbitrary distance; although once you've failed once, you know where you're going to go back to.

The game is therefore, at the abstract level, a series of challenges that must be overcome in order to progress; failure in a particular challenge sets you back to, at best, the start of that challenge or, at worst, the start of the level. You don't know where until you fail a challenge, but when you've failed a challenge you have some idea of the new worst-case scenario.

The problem is that some challenges are more, well, challenging than others, but failing them causes you to have to repeat the less obnoxious ones in order to retry the difficult one. In a save-when-you-want game you would simply save before you reached the difficult challenge, in order to avoid repeating the easy ones more than once.

This reduces the easy challenges to chores, trivial tasks that you gradually become adept at and simply have to slog through to try the part you keep failing at, until eventually you find the secret to the difficult part. This quickly stops being entertaining.

Games should not be chores. Chores are punishment.

Incidentally, the game (so it calls itself) has another punishment mechanism: traps. Consider the welcome form of punishment, whereby you are set back for failing a challenge - this is the expected function of a game, since a game is supposed to be entertaining by presenting a challenge, and a challenge you can't fail is not a challenge at all. The trap I'm talking about is not a trap for the character in the game, but a trap for the player. In the game, traps are visible and have a disarming mechanism; but traps for the player are unexpected, random events. Unexpected, undesirable, but avoidable by the designer.

Twice, so far, the game has required me to be discreet, quiet, stealthy - this means light off - and then punished me by leaving traps in the dark. Things I cannot have avoided by using skill - points in the game where the only two approaches to the challenge would have caused me to fail. Damned if you do, and damned if you don't. The only way to beat the challenge is to have failed it at that point once already. How do I know there won't be another trap ahead? This challenge has become a chore.

Codec[1]

Maintain flow. Most of the things I've listed as examples of punishment are flow-breaking. Most of the time, the user doesn't want to have to know how to perform the task; they need to be prompted to enter information, and as little information as possible. Every step along the way is a step further away from them achieving their goal, and the value of your system is entirely measured in how many people use it to achieve their goals.

Common punishments include:

  • Forcing the user to manually type information they use a computer to automate in the first place (autofill forms, or refusing to let me paste my generated passwords into the confirmation box).
  • Repetition of trivial tasks that shouldn't have to be done at all.
  • Requirement of information you don't strictly need.
  • Considering valid data to be invalid because your validation is broken (or vice versa).
  • Similarly, rejecting sensible input because you're scared of it (like most of my randomly-generated passwords).
  • Pretending to let you do something, and then moving the goalposts and not actually doing it.
  • Not providing sufficient information to help the user rectify the problem.
  • Fragmenting input forms across multiple pages.
  • Cramming a single page with too much input.
  • Discarding information because your fragile system shat itself.
  • Choosing fonts and colours that are difficult to read.
  • Making the user hunt for the next thing they have to do.
  • Related, leaving the user at the end of a process with no confirmation or failure message, so they don't know that they're done, or feeling that they have to do it all again.

I'm sure if I use the internet for another day I'll be able to double this list but you get the idea. For every action the user has to take, is it something they've prepared for, and do they actually have to do it?

[1] [sic]

Day 12ish: PERL

PERL is wrong. It was invented at some point to mean Practical Extraction and Report(ing) Language, but Perl was never originally called that.

Although I do quite like the interpretation Poor Excuse for a Real Language, which unfortunately doesn't initialise to PHP.

There's also a swathe of awful, ancient code written in Perl.

This legacy dogs Perl's steps, despite the recent rise of Perl like an X-Wing rising out of Dagobah swamps.

Thus I propose a naming convention: anything that can be considered to be dragging Modern Perl down should be referred to as PERL code. It's clear how PERL is indeed a pathetic excuse for a real language. Perl resembles PERL as much as Episode IV resembles Episode I.

PERL is dead. Long live Perl.

2014-12-11

Day 11: List context and parentheses

It's common to start off believing that () make a list, or create list context. That's because you normally see lists first explained as constructing arrays:

my @array = (1,2,3);

and therefore it looks like the parentheses are part of list context.

They aren't. Context in this statement is determined by the assignment operator. All the parentheses are doing is grouping up those elements, making sure that all the , operators are evaluated before the = is.
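
A small sketch of that, with made-up values: the right-hand side is the same parenthesised expression every time, and only the left-hand side changes what comes out.

```perl
use strict;
use warnings;
no warnings 'void';    # the scalar assignment below deliberately discards values

my @all     = (4, 5, 6);    # list assignment: all three elements
my ($first) = (4, 5, 6);    # list assignment to a one-element list: keeps 4
my $last    = (4, 5, 6);    # scalar assignment: comma operator, keeps 6

print scalar(@all), " $first $last\n";
```

If the parentheses on the right made a list, the third line would have something list-like to say about it. It doesn't; you just get the last thing the comma operator evaluated.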

There is exactly one place in the whole of Perl where this common misconception is actually true.

LHS of =

On the left of an assignment, parentheses create list context. This is how the Saturn operator works.

$x = () = /regex/g;
#   |______________|

The marked section is an empty list on the left-hand side of an assignment operator: the global match operation is therefore in list context.
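
A runnable sketch of the same thing, using a made-up string, compared against a plain scalar assignment:

```perl
use strict;
use warnings;

my $text = 'one two three four';

# Empty list on the left: the match runs in list context and finds every
# word, then the outer scalar assignment counts how many there were.
my $count = () = $text =~ /\w+/g;

# No parentheses on the left: the match runs in scalar context, matches
# once, and only reports whether it succeeded.
my $matched = $text =~ /\w+/g;

print "count=$count matched=$matched\n";
```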

LHS of x

This is a strange one. The parentheses do construct a list, but they do not give the stuff inside them list context; that still comes from outside.

my @array = (CONSTANT) x $n;

In this case, CONSTANT - presumably sub CONSTANT {...} - is in list context; x gains list context from the =, and CONSTANT inherits it.

my $str = (CONSTANT) x $n;

Here we have x in scalar context because of $str, and CONSTANT in scalar context because of that. This is not really a whole lot of use, however.
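
For reference, a sketch of the useful forms of x, with made-up values: parentheses plus list context repeats the list, and no parentheses repeats a string.

```perl
use strict;
use warnings;

my @zeros  = (0) x 5;           # list repetition: five zeros
my @pairs  = ('a', 'b') x 2;    # the whole list repeats: a b a b
my $string = 'ab' x 3;          # no parentheses: string repetition

print "@zeros | @pairs | $string\n";
```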

Various Contexts

This sub reports whether it's called in scalar, list or void context[1]:

sub sayctx { say qw(scalar list void)[wantarray // 2] }

Now we can test a few constructs for context:

# void
sayctx;

# scalar
scalar sayctx;

# scalar
my $x = sayctx;

# list
my @x = sayctx;

# list
() = (sayctx) x 1;

# scalar
my $x = (sayctx) x 1;

# list
last for sayctx;

# scalar
while (sayctx) { last }

# scalar
1 if sayctx;

# scalar, void
sayctx if sayctx;

# scalar, scalar
sayctx > sayctx;

[1] Understanding it is left as an exercise to the reader.
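
For anyone who would rather skip the exercise, here is a longhand sketch of the same idea. The sub name ctx and the @seen log are my own inventions for this example; it records the context instead of printing it so the results can be inspected afterwards.

```perl
use strict;
use warnings;

my @seen;    # log of the contexts ctx() was called in

sub ctx {
    my $want = wantarray;    # undef in void, false in scalar, true in list
    push @seen, !defined $want ? 'void'
              : $want          ? 'list'
              :                  'scalar';
    return;
}

ctx();               # void: nothing uses the result
my $x   = ctx();     # scalar assignment
my @y   = ctx();     # list assignment
my ($z) = ctx();     # parentheses on the left of =, so list again

print "@seen\n";
```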

2014-12-10

Day 10: Fixes to DBIx::Class::InflateColumn::Boolean

I'm finding my new position at OpusVL ever more valuable. We like to put extra time into getting to the bottom of an issue because we rely so heavily on open-source software. Problems we discover in the modules we use are worth investigating for their own sake, simply because the amount of time already put into the modules by other people is years; years we didn't have to spend ourselves.

Today I discovered that, if I ran my Catalyst application under perl -d, it didn't actually run at all.

After much involvement from various IRC channels I came to the conclusion that the problem was in Contextual::Return; or rather, the problem was in the 5.14 debugger, since it seems OK in 5.20.

Anyway, Contextual::Return was employed by DBIx::Class::InflateColumn::Boolean, which I was using because SQLite doesn't have ALTER COLUMN. We test components of Catalyst applications as small PSGI applications with SQLite databases backing them, which has its own problems, but in this case the issue was the column in question being declared boolean NOT NULL DEFAULT false, and SQLite not translating "false" as anything other than the string "false", and then shoving it in a boolean column anyway.

So DBIC faithfully gave me "false" back when I accessed the row, and "false" is true, so everything broke.
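
In case that sounds odd, a quick sketch of Perl's idea of truth. The truth helper is just something made up for this example.

```perl
use strict;
use warnings;

# Perl's false values are undef, 0, '0' and the empty string.
# Every other string is true, including "false".
sub truth { $_[0] ? 'true' : 'false' }

for my $value ('false', 'true', '0', '', '0.0') {
    printf "%-8s is %s\n", "'$value'", truth($value);
}

# What effectively happened to my column:
my $from_sqlite = 'false';
my $flag        = $from_sqlite ? 1 : 0;
```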

So I inflated the column.

This all resulted in a patch to DBIx::Class::InflateColumn::Boolean, authored by haarg, removing the dependency on Contextual::Return entirely.

This may be a case of avoiding rather than fixing the problem, but since the problem appears to exist in the 5.14 debugger, the only way to fix that is to update to 5.20 - or whenever it was that it was fixed.

It also prompted me to rebuild the SQLite database to remove that default. Turns out DBIC doesn't fill in default values when creating rows.

2014-12-09

Day 9: Scalar filehandles, or IO, IO, it's not to disk we go

Did you know you can open a variable as a file handle?

This is a great trick that avoids temporary files. You can write to the filehandle, and the stuff written thereto is available in the other variable. I'm going to call the other variable the "buffer"; this is a common term for a-place-where-data-get-stuffed.

Here's an example whereby I created an XLS spreadsheet entirely in memory and uploaded it using WWW::Mechanize. The template for the spreadsheet came from __DATA__, the special filehandle that reads stuff from the end of the script.

This allowed me to embed a simple CSV in my script, amend it slightly, and then upload it as an XLS, meaning I never had to have a binary XLS file committed to git, nor even written temporarily to disk.

In the example below, a vehicle, identified by its VRM (registration plate) is uploaded in an XLS spreadsheet with information about its sale. The $mech in the example is ready on the form where this file is uploaded.

The main problem this solves is that the VRM to put into the spreadsheet is generated by the script itself, meaning that we can't just have an XLS file waiting around to be uploaded. As noted, it is also preferable not to have to edit an XLS file for any reason, essentially because this can't be done on the command line - LibreOffice is required, or some Perl hijinks.

open my $spreadsheet_fh, ">", \my $spreadsheet_buf;       # [1]
my ($header, $line) = map { chomp; [split /,/] } <DATA>;  # [2]
my $xls = Spreadsheet::WriteExcel->new($spreadsheet_fh);  # [3]
my $sheet = $xls->add_worksheet();

# processing

$line->[0] = $vrm;

$sheet->write_col('A1', [ $header, $line ]);              # [4]
$xls->close;

$mech->submit_form(
  with_fields => {
      file => [ [ undef, 'whatever', 
          Content => $spreadsheet_buf ],                  # [5]
      1 ]
  },
  button => 'submit',
);

# [6]
__DATA__
VRM,Price,Fees,Collection,Valeting,Prep costs
,2333,10,0,10,0

The key to this example is in [1], which looks like a normal open call except for the last expression:

\my $spreadsheet_buf;

This is a valid shortcut to declaring the $spreadsheet_buf and then taking a reference to that:

my $spreadsheet_buf;
open my $spreadsheet_fh, ">", \$spreadsheet_buf;

The clever part is that now, $spreadsheet_fh is a normal filehandle that can be used just like any other; just as if we'd used a filename instead of a scalar reference. At [3] you can see a normal Spreadsheet::WriteExcel constructor, taking a filehandle as the argument, as documented.

At [2] you can see DATA in use, which reads from __DATA__ at [6]. This also acts like a normal filehandle; <DATA> reads linewise, and we have to chomp to remove the newlines.

We map over these lines, chomping them and using split /,/ to turn them into lists of strings; and this list is inside the arrayref constructor [...], meaning we get an arrayref for each line.
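
Here is that step on its own as a sketch. It uses an ordinary array of lines instead of <DATA>, and the registration plate is made up.

```perl
use strict;
use warnings;

# Stand-ins for the two lines read from <DATA> in the script above.
my @lines = ("VRM,Price,Fees\n", ",2333,10\n");

# One arrayref per line: chomp removes the newline, split breaks on commas.
my ($header, $line) = map { chomp; [ split /,/ ] } @lines;

# split keeps the empty leading field, so the VRM slot exists but is blank.
my $blank_vrm = $line->[0];
$line->[0] = 'AB12CDE';    # made-up registration plate

print join('|', @$header), "\n", join('|', @$line), "\n";
```

It's the empty leading field that makes the "gap at the front" trick work: split drops trailing empty fields but keeps leading ones.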

At [4] we have processed sufficiently to have installed the VRM in the gap at the front of the second line, i.e. the zeroth element of $line, so write_col is employed to write both arrayrefs as rows (yes I know) into the spreadsheet.

When we call $xls->close, this writes the spreadsheet to the filehandle. But no file is created; instead, the data go to $spreadsheet_buf. If we were to print $spreadsheet_buf to a file now, we would get an XLS we can open.
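
Stripped of the spreadsheet, the mechanism looks something like this sketch (variable names are mine):

```perl
use strict;
use warnings;

# Write to a scalar as though it were a file.
open my $out, '>', \my $buffer or die "Cannot open in-memory handle: $!";
print  {$out} "first line\n";
printf {$out} "%s line\n", 'second';
close $out;

# The same trick works for reading the data back.
open my $in, '<', \$buffer or die "Cannot reopen buffer: $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " lines, ", length($buffer), " bytes\n";
```

Nothing touches the disk at any point; $buffer is an ordinary string that happens to have been filled by print.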

Instead, at [5], we use the trick documented in submit_form (ether++ for reading everyone's mind) to use the file data we already have as the value of the form field.

This trick is remarkably useful. You can reopen STDOUT to write to your buffer:

{
    local *STDOUT;

    open STDOUT, ">", \my $buffer;

    do_stuff_that_prints();

    do_stuff_with($buffer);
}

but that's better written

my ($buffer) = capture { do_stuff_that_prints() };

from Capture::Tiny.

See also

If you use IO::Handle then your $spreadsheet_fh will be an object like any other - but these days, you get that simply by using lexical filehandles anyway.

IO::Scalar seems like a decent OO-type module that deals with this and also looks nice.

IO::String also works with strings-as-IO.

I've not tried either of these latter two, but YMMV etc.