<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
  xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>kvz.io</title>
    <link>https://kvz.io</link>
    <description>A blog on building: software, infra, and <a href="https://transloadit.com">Transloadit</a></description>
    <pubDate>Tue, 26 Jul 2022 00:00:00 +0200</pubDate>
    <item>
      <title>Setting up macOS for JS development</title>
      <link>https://kvz.io/macos-install.html</link>
      <description><![CDATA[Three years ago MacBooks were in a pretty bad spot for me and I switched to Ubuntu, later Pop!_OS. It was a fun ride. While coding I felt very productive because the OS is so low in distractions and just feels incredibly responsive. Installing the world via apt beats brew 10x, and native Docker on Linux is so much faster it isn't even funny. We had to abandon Docker because we had folks with macOS on the team. But, other tasks (email, conference calling, scanning, word, upgrading without breakage) came with more friction, and those tend to fill up ever larger shares of my day.
]]></description>
      <pubDate>Tue, 26 Jul 2022 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/macos-install.html</guid>
      <content:encoded><![CDATA[Three years ago MacBooks were in a pretty bad spot for me and I [switched to Ubuntu](/tobuntu.html), later Pop!\_OS. It was a fun ride. While coding I felt very productive because the OS is so low in distractions and just feels incredibly responsive. Installing the world via `apt` beats `brew` 10x, and native Docker on Linux is so much faster it isn't even funny. We had to abandon Docker because we had folks with macOS on the team. But, other tasks (email, conference calling, scanning, word, upgrading without breakage) came with more friction, and those tend to fill up ever larger shares of my day.

With MacBooks in a better place, I'm back on a Mac. That is, I'll keep Linux close, and I'm very grateful to have options now should Apple pull more shenanigans, or should we not manage to port our stack to ARM before second-hand Intel Macs run out (I'm following [this](https://developer.apple.com/documentation/virtualization/running_intel_binaries_in_linux_vms_with_rosetta) closely too). But macOS will be my main driver again for some time.

Things to like after my first few days again:

- easy access to messages, reminders, notes, photos from my phone
- trackpad support is on another level and this directly translates to productivity for me
- copy-paste working intuitively and reliably out of the box
- no more "sorry audio device is gone rebooting i'm on linux" when videoconferencing
- some apps are more accessible, or more polished (like Word, or GitHub Desktop)
- unlocking and paying with the fingerprint scanner (Touch ID) works very well, same for getting past `sudo` auth. And when the lid is closed, my Apple Watch will unlock the machine when I'm close
- I did try gnome-sushi but it's no match for Preview, which allows me to plow through administration much more effortlessly. Copying from previewed PDFs works out of the box; with sushi I had to open the file first, for example
- searching in Finder works much better, e.g. it also searches a PDF's contents by default, which is great for me

From experience I've learned that, if you can, it's best to set up an OS from scratch and only then apply any hacks and apps that you still find relevant, rather than dragging a long tail of experiments into every new release via cloning/upgrading from old machines. It's good to let apps & hacks earn their place again by starting fresh sometimes.

So I reserved a day and documented the steps to make macOS suitable for my use again. This will be highly subjective, but who knows, maybe you can draw some inspiration from it, or spot something I'm clearly doing wrong and I learn a new trick or two. And if not, I at least have repeatable steps for a Mac after this or some reinstall after I screw up.

Without further ado..

## Update and Configure macOS

First, let's:

- Upgrade to the latest OS version available via `System Preferences » Software Update`
- Set hostname `System Preferences » Sharing » Local hostname`
- Set up Hot corners. Weirdly enough you can find this under `System Preferences » Desktop » Screen Saver » Hot Corners...` . I bind Upper Left to Mission Control, Upper Right to Desktop. Note btw that the Bottom Right can create Notes since macOS 12 Monterey
- Drag the Desktop folder into the Dock and set it up as a `Fan`, sort by `Last Modified` for easy access. For Downloads this is already the case.
- [Unlock your Mac with Apple Watch](https://support.apple.com/en-us/HT206995): `System Preferences » Security & Privacy » General » Use Apple Watch to unlock apps and your Mac`. If you have more than one Apple Watch, select the watches you want to use to unlock your apps and Mac.
- Exclude some garbage locations from Spotlight. `System Preferences » Spotlight » Privacy » Add`. Consider excluding the entire `~/code` directory to get rid of `node_modules`. Code is not typically found via Spotlight anyway. `Accounting/automation` is another good candidate.
- Finder:
  - `Finder »`
    - `Preferences »`
    - `General »`
      - `New Finder windows show: $USER`
    - `Advanced »`
      - `Show all filename extensions`
      - `Remove items from the Trash after 30 days`
      - `Keep folders on top: In windows when sorting by name`
      - `Keep folders on top: On Desktop`
      - `When performing a search: Search the Current Folder`
    - `View »`
      - `Show Path Bar`
      - `Show Status Bar`
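
Most of the Finder settings above can probably also be applied from a terminal. This is a minimal sketch, assuming the preference keys below still work on your macOS version: they come from community dotfiles rather than Apple documentation, and `configure_finder` is just a name made up for this example.

```bash
# Assumption: these preference keys are undocumented and may change between
# macOS releases. Check the result in Finder's Preferences afterwards.
configure_finder() {
  defaults write NSGlobalDomain AppleShowAllExtensions -bool true    # Show all filename extensions
  defaults write com.apple.finder _FXSortFoldersFirst -bool true     # Keep folders on top when sorting by name
  defaults write com.apple.finder FXDefaultSearchScope -string SCcf  # Search the current folder
  defaults write com.apple.finder ShowPathbar -bool true             # View » Show Path Bar
  defaults write com.apple.finder ShowStatusBar -bool true           # View » Show Status Bar
  killall Finder                                                     # Restart Finder to pick up the changes
}

# Usage on a Mac: configure_finder
```

The remaining settings (new window location, Trash cleanup, folders on top on the Desktop) are left to the Preferences window here.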

## Install Apps from the AppStore

From the AppStore let's install:

- [1Password](https://apps.apple.com/nl/app/1password-7-password-manager/id1333542190)
- [Sequel Ace](https://apps.apple.com/nl/app/sequel-ace/id1518036000) (this replaces the now unmaintained Sequel Pro)
- [Slack](https://apps.apple.com/nl/app/slack-for-desktop/id803453959)
- [Pixelmator](https://apps.apple.com/nl/app/pixelmator-classic/id407963104)
- [Reeder](https://apps.apple.com/nl/app/reeder-5/id1529448980)
- [UTM](https://apps.apple.com/nl/app/utm-virtual-machines/id1538878817?l=en&mt=12)
- [Word](https://apps.apple.com/nl/app/microsoft-word/id462054704) (I've tried to avoid it, but concluded that a `.docx` is a lawyer's self-contained emailable git repo and really the only way I can collaborate with them)

Open Photos. Set it up to: `Save Originals on Mac` (this may be a lot of data. What makes it worth it for me is that I also backup this drive and then I can trust my life's photos won't be wiped out if there ever is some iCloud malfunction)

## Install Apps from the web

- Install [Rectangle Pro](https://rectangleapp.com/pro) (so you drag windows by moving the pointer anywhere in them with `Option`, and resize them with `Option + Shift`). Requires a $10 license. Go to `Settings » Sync configuration over iCloud`
- Install [Brave](https://brave.com/). Set it up to: Sync Chain from mobile phone, disable wallet, ads, etc
- Install [Dropbox](https://www.dropbox.com/install). Set it up to keep files off-line, but add Selective Sync just on the important folders (like `Config`).
- Install [iTerm2](https://iterm2.com/downloads.html). Set it up to load its profile from `~/Dropbox/Configs/iTerm2/`. I tried the regular Terminal as a replacement, but I couldn't configure it to open source files at the correct line by clicking on them. That saves me a lot of time interfacing with tsc and eslint, so I'm staying on iTerm2 for now, even if it's a bit slower (and an extra install)
- Install [WhatsApp](https://www.whatsapp.com/download)
- Install [Signal](https://signal.org/download/)
- Install [Spotify](https://www.spotify.com/nl/download/mac/)
- Install [VS Code](https://code.visualstudio.com/download), enable sync, instruct to install `code` in PATH
- Install [VirtualBox](https://www.virtualbox.org/wiki/Downloads)
- Install [GitHub Desktop](https://desktop.github.com/)
- Install [Zoom](https://zoom.us/download)
- Install [Fujitsu ScanSnap Home for the iX500](https://scansnap.fujitsu.com/global/dl/mac-1200-ix500.html) and ABBYY FineReader for ScanSnap (it will propose to install this). Set it up to "Scan to Abby".

## Install Apps from Homebrew

In iTerm2:

```bash
# Switch the login shell to Bash
chsh -s /bin/bash

# Install Xcode Command Line Tools
xcode-select --install

# Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install CLI software
brew install \
  awscli \
  bash-completion@1 \
  bat \
  corepack \
  coreutils \
  curl \
  diffutils \
  exiftool \
  gh \
  git \
  git-lfs \
  google-cloud-sdk \
  gpg \
  html2text \
  htop \
  imagemagick \
  ipcalc \
  jq \
  mc \
  mosh \
  nodejs \
  pinentry-mac \
  poppler \
  shellcheck \
  starship \
  telnet \
  tmux \
  vagrant \
  watch \
  wget \
  yarn-completion \
&& true

# Powerline fonts for Starship prompt
(
  cd "${TMPDIR}" \
    && git clone https://github.com/powerline/fonts.git --depth=1 \
    && cd fonts \
    && ./install.sh \
    && cd .. \
    && rm -rf fonts \
    && true
)
```

## Setup CLI & dev tools

In iTerm2:

```bash
# Save any custom script here
mkdir -p ~/bin

# You may need to occasionally rerun this, but caching 'gh' completions like this saves
# noticeable latency when opening a new prompt/tab:
gh completion -s bash > ~/.bash_completions_gh

# Avoid VirtualBox Valid ranges error
sudo mkdir -p /etc/vbox/
sudo tee "/etc/vbox/networks.conf" > /dev/null <<'EOF'
* 10.0.0.0/8
* 192.168.33.0/24
* 2001::/64
EOF

# Allow Touch ID to unlock sudo (from: https://news.ycombinator.com/item?id=32611340)
if ! grep -q "pam_tid.so" /etc/pam.d/sudo; then
  echo "Touch ID not enabled for sudo. Inserting in /etc/pam.d/sudo .."
  sudo perl -pi -e 's/(pam_smartcard.so)/$1\nauth sufficient pam_tid.so/' /etc/pam.d/sudo
fi

# ~/.config may not exist yet on a fresh install
mkdir -p ~/.config
cat << 'EOF' > ~/.config/starship.toml
[aws]
disabled=true
[gcloud]
disabled=true
EOF

cat << 'EOF' > ~/.bash_profile
# PATH
export PATH=/opt/homebrew/bin:/usr/local/opt/coreutils/libexec/gnubin:${HOME}/bin:${HOME}/code/dotfiles/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin

# Starship Prompt
eval "$(starship init bash)" || true

# Do not mention we should switch to ZSH
export BASH_SILENCE_DEPRECATION_WARNING=1

# https://stackoverflow.com/a/52901834/151666
export LC_CTYPE="en_US.UTF-8"

# Append to the Bash history file, rather than overwriting it
shopt -s histappend

# Bash aliases
alias ..="cd .."
alias upb="yarn add \$(npm outdated |grep -E '^(@types/)?(eslint|sucrase|jest|babel|typescript|@babel|@sucrase|@typescript)'|awk '{print \$1}')"
alias e="vscode-workspace.sh"
# ^-- that file is in my ~/dotfiles/bin and among other things does:
# [ -f ".vscode/$(basename "${PWD}").code-workspace" ] && code ".vscode/$(basename "${PWD}").code-workspace" || code .
# this way I can enter any project and just execute `e` to open it correctly in VS Code

# Load completions
# [[ -r /usr/local/etc/profile.d/bash_completion.sh ]] && source /usr/local/etc/profile.d/bash_completion.sh
# ^-- commented out for latency reasons
[[ -f /usr/local/etc/bash_completion ]] && source /usr/local/etc/bash_completion
[[ -f /opt/homebrew/etc/bash_completion ]] && source /opt/homebrew/etc/bash_completion
[[ -f ~/.bash_completions_gh ]] && source ~/.bash_completions_gh
[[ -f /usr/local/etc/bash_completion.d/yarn ]] && source /usr/local/etc/bash_completion.d/yarn
[[ -f /usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/path.bash.inc ]] && source /usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/path.bash.inc
EOF
```
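
The `e` alias in the profile above points to `vscode-workspace.sh`, which isn't listed in this post. Here is a minimal sketch of what such a script could look like, based only on the one-liner in the comment. The function name `workspace_target` is hypothetical, and the real script does more than this.

```bash
# Hypothetical sketch of the core of vscode-workspace.sh.
# workspace_target prints the path VS Code should open for a project directory:
# its .vscode/<dirname>.code-workspace file if one exists, else the directory.
workspace_target() {
  dir="${1:-$PWD}"
  workspace="${dir}/.vscode/$(basename "${dir}").code-workspace"
  if [ -f "${workspace}" ]; then
    printf '%s\n' "${workspace}"
  else
    printf '%s\n' "${dir}"
  fi
}

# Usage, from inside a project: code "$(workspace_target)"
```

Keeping the lookup in a function that only prints a path makes it easy to check which file would be opened before handing it to `code`.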

## Node.js

Already installed higher up in the Homebrew command, but these will also come in handy:

```bash
yarn global add updtr sucrase
npm login
```

## More responsive typing

- [Increase key repeat rate on macOS](https://gist.github.com/hofmannsven/ff21749b0e6afc50da458bebbd9989c5) (`System Preferences » Keyboard » Key Repeat/Delay Until Repeat` seems enough for my use but you can go wilder via cli)
- [Optimize your shell profile](https://twitter.com/thorstenball/status/1364458766368968705) (in my case, I eliminated calls to `brew --prefix` and cached `gh` autocompletion in my `~/.bash_profile`)
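
Going "wilder via cli" for the key repeat rate could look roughly like this. A sketch, assuming the `KeyRepeat` and `InitialKeyRepeat` keys described in the linked gist, where lower values mean faster; `set_key_repeat` is a made-up helper name, and the defaults of 2 and 15 are the values the gist reports as the fastest the sliders offer.

```bash
# Set the key repeat rate and the delay until repeat via `defaults`.
# Assumption: lower is faster; values below 2 and 15 go beyond what the
# System Preferences sliders allow.
set_key_repeat() {
  repeat="${1:-2}"
  delay="${2:-15}"
  defaults write -g KeyRepeat -int "${repeat}"
  defaults write -g InitialKeyRepeat -int "${delay}"
}

# Usage on a Mac (log out and back in afterwards): set_key_repeat 1 10
```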

## Setup Git + LFS

```bash
git lfs install
sudo git lfs install --system
git config --global pull.rebase true
git config --global push.autoSetupRemote true
git config --global alias.conflicts "diff --name-only --diff-filter=U"
git config --global alias.lol "log --pretty=oneline --abbrev-commit --graph --decorate"
git config --global alias.co "checkout"
git config --global alias.st "status -sb"
```

## Setup Git SSH Keys for cloning

```bash
ssh-keygen -t rsa -b 4096 -C "${USER}@${HOSTNAME}" # enter, enter, enter
cat ~/.ssh/id_rsa.pub
# and add the contents as a key to https://github.com/settings/keys
```

## Setup Git Verified commits

This assumes you have keys already. If you do not, check out this full [walkthrough](https://openmdao.org/newdocs/versions/latest/other_useful_docs/developer_docs/signing_commits.html).

```bash
# Get key id from https://github.com/settings/keys
keyID="xxxxx"

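# Link the synced GnuPG home into place. If ~/.gnupg already exists as a real
# directory, move it aside first, otherwise the link ends up inside it.
ln -nfs ~/Dropbox/Configs/gnupg ~/.gnupg
chmod 700 ~/.gnupg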
git config --global user.signingkey "${keyID}"
git config --global commit.gpgsign true
git config --global gpg.program $(which gpg)
git config --global user.email kevin@vanzonneveld.net

if ! grep -q "pinentry-program /usr/local/bin/pinentry-mac" ~/.gnupg/gpg-agent.conf; then
  echo "pinentry-program /usr/local/bin/pinentry-mac" >> ~/.gnupg/gpg-agent.conf
fi

gpgconf --kill gpg-agent
```

## Cloud tools

```bash
gcloud init
aws configure
```

## Setup GitHub CLI + aliases

```bash
gh auth login
```

```bash
gh alias set todo-meta 'issue create --repo=transloadit/team-internals --project="🤖 The Board" --body="n/a" --label="meta" --assignee="@me"' \
  && gh alias set todo-founder 'issue create --repo=transloadit/founder-internals --project="🤖 The Board" --body="n/a" --label="meta" --assignee="@me"' \
  && gh alias set todo-accounting 'issue create --repo=transloadit/accounting --project="🤖 The Board" --body="n/a" --label="accounting" --assignee="@me"' \
  && gh alias set todo-legal 'issue create --repo=transloadit/legal --project="🤖 The Board" --body="n/a" --label="legal" --assignee="@me"' \
  && gh alias set todo-website 'issue create --repo=transloadit/content --project="🤖 The Board" --body="n/a" --label="website" --assignee="@me"' \
  && gh alias set todo-content 'issue create --repo=transloadit/content --project="🤖 The Board" --body="n/a" --label="content" --assignee="@me"' \
  && gh alias set todo-nix 'issue create --repo=transloadit/api2 --project="🤖 The Board" --body="n/a" --label="nix" --assignee="@me"' \
  && gh alias set todo-api2 'issue create --repo=transloadit/api2 --project="🤖 The Board" --body="n/a" --label="api" --assignee="@me"' \
  && gh alias set todo-growth 'issue create --repo=transloadit/growth --project="🤖 The Board" --body="n/a" --label="growth" --assignee="@me"' \
  && gh alias set todo-botty 'issue create --repo=transloadit/botty --project="🤖 The Board" --body="n/a" --label="satellite" --assignee="@me"' \
  && true
```

Now when I think of a new Website todo, I type `gh todo-website`, fill out a title, and I'm done. Optionally I CMD+Click the link, which takes me to the issue on GitHub.com to fill out more details.

## Setup Git Repos

```bash
mkdir -p ~/code
# sync code + env files from existing machine via ssh
rsync \
  -a \
  --progress \
  --ignore-existing \
  --exclude='node_modules/' \
  --exclude='.Trash-*' \
  --exclude='.vagrant' \
192.168.68.92:/Users/kvz/code/ ~/code \
&& true

cd ~/code/api2
vagrant plugin install vagrant-vbguest
ln -nfs ${HOME}/code/api2/core/bin/tlc.sh ~/bin/tlc
chmod 755 ~/bin/tlc
```
]]></content:encoded>
      <dc:date>2022-07-26T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Steps to Convert JavaScript to TypeScript</title>
      <link>https://kvz.io/js-to-ts.html</link>
      <description><![CDATA[I've been slowly falling in love with TypeScript. I have a thousand little JS projects. Small prototypes with minimal tests and documentation. Often just to help me get a thing done. Typically when I revisited those after some months,
I would be a complete stranger with no mental map of all the components or constraints or decisions that led to things being as they are.
]]></description>
      <pubDate>Thu, 25 Nov 2021 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/js-to-ts.html</guid>
      <content:encoded><![CDATA[I've been slowly falling in love with TypeScript. I have a thousand little JS projects. Small prototypes with minimal tests and documentation. Often just to help me get a thing done. Typically when I revisited those after some months,
I would be a complete stranger with no mental map of all the components or constraints or decisions that led to things being as they are.

When I wrote it, it all made sense. Looking at it later, I would be disappointed in my brain's ability to retain the ins & outs of the project, and it would feel like a minefield. Any step taken to change things could mean an explosion and the loss of something dear.

But I've been playing around with TypeScript for a year or so now, and when I revisit those projects, even though they were created in similar haste and vain, the code is more self-explanatory and it is actually hard to make missteps.

Even though I'm a TypeScript amateur, I did want more of this. But, as indicated, I have many small projects lying around, so I need to simplify migration to TypeScript.

Folks on the internet say: "Well TypeScript is just a superset of JavaScript so you can simply rename .js to .ts and gradually fix warnings, and then turn them into errors."

That's kinda true but the reality is more involved as the documenting of my steps in this blog will demonstrate.

So I've tried to leverage computers: I'm okay with manual work, but I added automation wherever it could be set up reasonably quickly.

For instance, the automation used here will not figure out types by itself. It just aims to get you past the first bumps in the road, after which you will have a TypeScript project. From there you can add fixes in smaller increments, basically removing occurrences of `any` and `// @ts-expect-error` as you go. That gets you increasingly more benefits from TypeScript, while the first migration step should not take more than an hour or so.

As a dad and [co-founder](https://transloadit.com), an hour here and there is what I can spare, so this splits the work in bite-sized chunks.

I assume (do make changes to accommodate your own setup):

- Node 14+
- Jest as your test runner
- ESLint as your linter and formatter
- Yarn as your package manager
- Your files are safe under Git, and you created a fresh branch `ts` in sync with latest HEAD
- You understand this is a very risky process and you are ready to revert and/or put in manual work as well
- Your source files are under `src/` and `test/`
- You have a back-end program. This does not account for Webpack, React, etc.

## Install dependencies

First I installed a number of dependencies:

```bash
yarn --dev add \
  @types/jest \
  @typescript-eslint/eslint-plugin \
  @typescript-eslint/parser \
  eslint-import-resolver-typescript \
  eslint-plugin-prefer-import \
  replace-require-with-import \
  ts-jest \
  ts-migrate \
  ts-node \
  typescript \
&& true
```

## Initialize

Then I initialized a `tsconfig.json` via: 

```bash
npx tsc --init
```

## Migrate

Now let's use [Airbnb's ts-migrate tool](https://github.com/airbnb/ts-migrate), which takes care of renaming `.js` -> `.ts`, adding properties on ES6 classes, a number of other things, and committing to Git in each step.

```bash
npx -p ts-migrate -c "ts-migrate-full ."
```

## CommonJS -> ESM

Now let's switch CommonJS `require` to ESM/TypeScript `import`:

```bash
# Run via npx, since require2import was installed as a local dev dependency
npx require2import ./{src,test}/**/*.ts
```

This only takes us so far; we'll need to make the exports ESM compatible as well, by replacing:

- `module.exports = ` -> `export default ` (this will require manual checking and fixing, and don't do this on the `.eslintrc.js` file)

The require2import script made some mistakes too, so I had to replace the following (just some examples to give you an idea):

- `import debug from 'depurar'('botty')` -> `import depurar from 'depurar'\nconst debug = depurar('botty')`
- `const abbr\s+=\s+require\('@kvz/abbr'\)` -> `import abbr from '@kvz/abbr'`
- hunt for any other `require(` occurrence and replace it with an import manually
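
The `module.exports` replacement above could be scripted roughly like this. It is a sketch, assuming a top-level `module.exports = ...` that starts at the beginning of a line; the function name `exports_to_default` is made up. Named exports such as `module.exports.foo = ...` are deliberately left alone so they still show up for manual review, and as noted, `.eslintrc.js` should be skipped.

```bash
# Rewrite a leading "module.exports = " to "export default " on stdin.
# Lines such as "module.exports.foo = ..." do not match and pass through untouched.
exports_to_default() {
  sed -E 's/^module\.exports[[:space:]]*=[[:space:]]*/export default /'
}

# Preview the result for one file without modifying it:
# exports_to_default < src/index.ts | diff src/index.ts -
```

Previewing via `diff` first fits the advice to check every replacement by hand before committing it.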

## Install missing types

Now I added missing types as reported by VSCode:

```bash
yarn --dev add @types/{lodash,mkdirp,intercom-client,common-tags,humanize-duration,js-yaml,node-schedule,yaml-front-matter,errorhandler,uuid}
```

For those modules without [DefinitelyTyped](https://github.com/DefinitelyTyped/DefinitelyTyped) submissions, I added `// @ts-expect-error` above their imports.

## Dev Ergonomics

For dev ergonomics, in `package.json` I replaced:

- `node ./src/**/*.js` -> `ts-node ./src/**/*.ts`
- I hunted for any other `.js` occurrences and decided whether I wanted to auto-transpile `.ts` with `ts-node` or not. If not, I opted for prefixing with `/dist/`, which will contain the transpiled JavaScript.
- I also changed my Nodemon scripts to transpile in near real time: `nodemon --watch "src/**" --ext "ts,json" --ignore "src/**/*.spec.ts" --exec "ts-node src/index.ts"`

## CI

For CI, in `package.json` I added:

```json
"build": "tsc --sourceMap --strict --outDir dist/",
```

and I added `dist/` to `.gitignore`.

I added a `yarn build` step to my CI, and let production know it should now start the program in `./dist/`. For example, if production uses `yarn start`, my `package.json` would now read:

```json
"start": "node ./dist/src/index.js",
```

This way, all the TypeScript build tooling can stay in dev/CI, and production benefits from being lightweight JavaScript (faster startup time without ts-node's transpilation, and fewer modules to distribute if you ship with `yarn --production`).

## Jest

Now let's make Jest TypeScript aware:

```bash
yarn ts-jest config:init
```

This creates a `jest.config.js` with the appropriate preset and environment to use.

## ESLint

I fixed ESLint by adding these to my `.eslintrc.js`:

```js
parser: '@typescript-eslint/parser',
plugins: ['@typescript-eslint'],
settings: {
  'import/resolver': {
    typescript: {},
    node      : {
      extensions: ['.js', '.jsx', '.ts', '.tsx'],
    },
  },
},
rules: {
  // Missing file extension "ts" for "../../src/services/Intercom"
  'import/extensions': [
    'error',
    'ignorePackages',
    {
      js : 'never',
      jsx: 'never',
      ts : 'never',
      tsx: 'never',
    },
  ],
}
```

The required modules were already part of my first `yarn` command.

I changed the `eslint --fix` command in my `package.json` to read:

```json
"fix": "env DEBUG=eslint:cli-engine eslint --fix . --ext .js,.jsx,.ts,.tsx",
```

## Manual fixes

I opened all files in VSCode and silenced TypeScript errors with `// @ts-expect-error` or other simple means. Elaborate fixes will have to wait for separate smaller focused pushes. Our aim right now is to have a TypeScript project in under an hour.

Finally I ran my lint fixer via `yarn fix`, as well as Jest unit tests, and fixed all they complained about. I made sure to commit often so that it was easy to roll back individual steps.

## Success?

Success. The whole process took less than an hour and CI is passing. While this is just the start and there is limited benefit due to TypeScript not fully understanding the code yet (by suppressing errors via `@ts-expect-error` and giving it types like `any`), these can gradually be resolved as I work on the codebase. 
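
To see how much of that suppression is left, a rough count can help. A minimal sketch, assuming sources live in `src/` and `test/` as stated earlier; `count_suppressions` is a hypothetical helper, and it only catches explicit `: any` annotations, not every way `any` can sneak in.

```bash
# Count lines in .ts files that contain "@ts-expect-error" or an explicit
# ": any" annotation. Defaults to the src and test directories.
count_suppressions() {
  if [ "$#" -eq 0 ]; then set -- src test; fi
  grep -rhE --include='*.ts' \
    -e '@ts-expect-error' \
    -e ': any([^A-Za-z0-9_]|$)' \
    "$@" | wc -l | tr -d ' '
}

# Usage, from the project root: count_suppressions
```

Running this before and after a cleanup push gives a simple number to watch go down.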

If this post feels a bit rough around the edges, that is correct, I just took down notes from converting one project and will make changes here after I do the next. 

Having the process documented here makes it somewhat repeatable, which will hopefully make it less error-prone and faster each time I refer to it, and improve it.

## Concluding

So TypeScript all the things? No, assuming you do like TypeScript, there are still cases I would probably refrain from adopting, for example if you have:

- many files going through a slow filesystem like vboxfs, a non-SSD, networked file system, or Docker on macOS (through its secret VM). The added transpilation time may be unbearable. Slow iteration times can really break your motivation to work on a thing
- a large community-oriented project whose barrier to entry you may not want to raise. While TypeScript may be a superset of JavaScript, people writing it are a subset. You may just get fewer contributions if you limit yourself to it. This holds true mostly/especially for browser-based projects, where interested parties may have backgrounds in Ruby or PHP, and know just enough JS to propose a fix, but are prohibitively discouraged when they encounter TypeScript. JavaScript is ubiquitous and this may, in cases, be an all-deciding factor to your project's success.

In these cases you could consider sprinkling JSDoc type comments on top of your JS codebase to get some of the benefits without Node.js or the browser no longer understanding your code as-is.

Please let me know if you spot mistakes, have additions, or would like to share your own experiences with TypeScript. As indicated I've barely scratched the surface so I'm sure there's room for improvement and would like to learn from you.]]></content:encoded>
      <dc:date>2021-11-25T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>And Now for Something Completely Different (NL)</title>
      <link>https://kvz.io/something-completely-different.html</link>
      <description><![CDATA[So far I have not used my writing for advertisement. Even (or especially) when I'm reviewing products, I've never taken free equipment, cash, guest/sponsored content, or anything of that nature. I may link to my own company but that's about it.
]]></description>
      <pubDate>Wed, 11 Dec 2019 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/something-completely-different.html</guid>
      <content:encoded><![CDATA[So far I have not used my writing for advertisement. Even (or especially) when I'm [reviewing products](/freewrite.html), I've never taken free equipment, cash, guest/sponsored content, or anything of that nature. I may link to my own company but that's about it.

Today is different!

But also it's not. I'd like to shine a light :flashlight: on my buddy Rein's blog, not because I'm getting paid to do so, but because I think it's a genuine hidden treasure on the web. Disclaimer: you have to be able to read **Dutch**! There's a million puns in there that Google Translate will absolutely butcher.

His blog can be found over at <https://www.wateengast.nl/>, and for almost two years he **has not missed a single day** to post. I've done a [30-day challenge](https://uppy.io/blog/2019/04/liftoff-30/), with help, and it nearly killed me, so I marvel at this level of creative output. And creative it is. Every day there's a silly question (that he invites readers to submit) with a very non-obvious, often absurd, but thoughtful answer. The results are hilarious pieces and exercises in out-of-the-box thinking. 

Besides the Q&As, another joy to read are Rein's reviews. For instance, his list of [the best things](https://www.wateengast.nl/wat-zijn-de-beste-dingen). (yes period.)

If you don't know Dutch, this is as good a reason as any to start learning! :smile:

]]></content:encoded>
      <dc:date>2019-12-11T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Going from macOS to Ubuntu</title>
      <link>https://kvz.io/tobuntu.html</link>
      <description><![CDATA[It's not the first time I'm switching to Ubuntu. I've been, as they say, around the block when it comes to operating systems. I started out on MS, from DOS to XP, then Ubuntu from 5.10 Breezy to 9.10 Karmic, then on Apple from OSX 10.5 Leopard to macOS 10.14 Mojave. Both in terms of productivity and delight I had my best years on Apple and I didn't think I'd ever look back. But here we are.
]]></description>
      <pubDate>Wed, 30 Oct 2019 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/tobuntu.html</guid>
      <content:encoded><![CDATA[It's not the first time I'm switching to Ubuntu. I've been, as they say, around the block when it comes to operating systems. I started out on MS, from DOS to XP, then Ubuntu from 5.10 Breezy to 9.10 Karmic, then on Apple from OSX 10.5 Leopard to macOS 10.14 Mojave. Both in terms of productivity and delight I had my best years on Apple and I didn't think I'd ever look back. But here we are.

Why? I (guess I'm the only person alive that) didn't mind the TouchBar or lack of a real escape key (can map Caps Lock to that). And I liked having 4 USB-C+ ports that I could do anything with. But yes, **The Keyboard**.

I spent the most money I ever did on this MacBook Pro, and it's also the worst machine I ever had because the keyboard breaks down (like, it won't register the `s`). When I dock it and use a different keyboard, it's fine of course. But sometimes I visit my co-founder in Berlin, or want to work from a coffee shop, and then it's nice to have a working keyboard. So far I've brought it in for repairs three times, and **each time I'm without my workhorse for a week**. Those are unplanned holidays that are dragging my productivity--and basically my company down.

I personally also feel **macOS has taken a freefall** regarding robustness and polish, but that might be just me. It's the keyboard that ultimately made me feel just really concerned about having my productivity/company/future so tightly coupled to what Apple ships next.

So what **machine did I get?** I played it safe and went for a Dell XPS. They come with Ubuntu pre-installed so you know you won't have to jump through hoops to get all the hardware going. There's a time and a place for those things, and it's called ~college :smile: Honestly: I really enjoyed [experimenting with Linux](https://www.youtube.com/watch?v=yzNjIBEdyww) and occasionally breaking it around my 20s. Back then (5.10 Breezy) it was not uncommon to compile Wi-Fi support into your kernel. I learned a lot and would not trade that experience for anything. But these days with a family and a company rightfully demanding my time, the time I spend behind a computer needs to be accounted for and it can't really be: `3 days, got webcam to work`. In short: fewer surprises are better for me now. There may well be superior options to an XPS. Research any hardware you intend to buy for Linux compatibility (also just for things like [webcams](https://wiki.ubuntu.com/SkypeWebCams)) and you'll find good options. I do like the XPS, it feels rugged, fast, polished, it has roughly the same ports that my MacBook Pro had, battery life is good. My only regret is that I didn't go for the larger 15". I dock it a lot, but for those other times, 13" is just a tad bit too small to do serious damage for me. And, at least with this 13" version, `<PgUp>` and `<PgDn>` are awfully close to `<Left>` and `<Right>`. That will take some getting used to and I wouldn't have minded if they left them out, and `<Ctrl>Up` was `<PgUp>`, for instance.

As for **why Ubuntu**, similar reason. There may very well be superior distros on many different metrics out there (and NixOS has me tempted), but the sheer community size of Ubuntu makes sure I'm not the first one running into a problem, and I'll find a solution online before it really slows me down. [askubuntu.com is right up there with the big ones](https://stackexchange.com/sites/). Community size is a quality whose benefits may be hard to quantify, yet I find it pays dividends. Even when I search for a non-Ubuntu-specific Linux question, I find that adding Ubuntu to my search query often improves the results.

So, can Linux be my workhorse?

Yes. But **this is not a sales pitch**. If you walk away thinking/knowing Linux is still too much trouble, that's a fair takeaway. There are sacrifices and struggles and whether those are worth it to you depends on, well, you. I don't intend to win anybody over to either side.

Ok let's dive in, I'll try to describe the things I ran into, the things I can't fix, and straight up howto's for the things I could.

## Things that bothered me immediately

- One day, **scrolling was superfast**. Turns out I had to [unplug the USB dongle that came with my mouse](https://askubuntu.com/a/360045/2222) and insert it back in.
- **Device support** is still lacking. I could not get Apple's Magic Trackpad 2 to work without [soul crushingly fragile hacks](https://www.reddit.com/r/linuxquestions/comments/7pxl5u/is_the_magic_trackpad_2_usable_in_ubuntu_wired_or/). But I guess a newer kernel is coming that will fix this :crossed_fingers:. I was unable to get my TomTom Running Watch to sync. My Fujitsu ScanSnap document scanner had no Linux tools (on macOS it can automatically OCR & archive to my Dropbox). This is very dear to me so I ended up using VirtualBox with a Windows VM for that. *Edit 2021-02-08: The Magic Trackpad 2 is now working, but you do need to increase finger sensitivity via `xinput`, added instructions for this.*
- **Photos**. Leaves a lot to be desired. This is the main reason I'll probably always at least keep an iPad or very lightweight MacBook around. But if I don't have to get a maxed out MacBook Pro <strong><sup>1</sup></strong>, I could get a high-end Linux machine *and* a modest Apple device, and still save money.
- **Copy paste** is still horribly 'broken'. I guess `<Cmd>C` isn't a thing on Linux and `<Ctrl>C` has a different meaning in terminals, so I can live with that. And I guess there are tricky/valid historical reasons for having [different clipboards](https://www.linux.com/tutorials/hone-your-desktop-clipboards-parcellite-linux/), but for the end user, it's not great if you lose your buffer when you close the originating app. Or having data in one clipboard while you need it in the other. Luckily I found a workaround that I listed further down. Wholeheartedly recommended. Paired with training muscle memory to do `<Ctrl><Insert>` (copy) and `<Shift><Insert>` (paste) on Ubuntu, that solved the problem for me.
- I invested in a screen with high DPI, but it's not '**Retina**', and the fonts don't render like they do on macOS. It seems like a small thing, but 4 weeks in, I never would have thought I still sometimes feel as though my eyes are dry and almost literally hurting :thinking: Did Steve Jobs spoil/ruin me for life? *Edit: I wrote this before my upgrade to Cosmic, and it got significantly better afterwards*
- If I close the lid of my XPS and open it 2 days later, the **battery** is fully drained. I just opened my MacBook after leaving it for weeks, and it still has juice enough to do serious work with (if the keyboard only allowed!). So it seems hibernation is much better on a Mac. *Edit: as [pointed out by GD in the comments](#comment-4671125661), this isn't actually Linux's fault but MSFT's and Dell's, for promoting "connected standby"*
- Every reboot my screen brightness is so low I can barely see a thing. The button to increase brightness is maxed out. It turned out I have to venture into the power saving settings to crank the **brightness** up there, but it does not persist across reboots. I avoid reboots now. If someone knows how to automate this let me know in the [comments below](#comments-area)!
- Selecting the right **audio/video input/output** is a proper chore. My Mac seemed to pick sane seemingly obvious defaults, whether I hooked up a screen with webcam, removed it, etc. With Ubuntu I have to open the audio settings and select the proper i/o at least twice a week as it doesn't pick obvious candidates. It's annoying for me, and often enough also for my teammates who I video conference a lot with. Sorry folks!
- I thought **Snaps** were really cool until I used them in my day to day. I used snaps for GitHub Desktop, Spotify, VSCode, Slack, and have since reverted all of that to using plain APT repositories. Issues ranged from intense CPU hogs, to links in Slack not opening, or always opening in a new Firefox window, seemingly random segfaults, etc. I guess it's still a bit too early and some programs don't like to be contained so much, or I'm just plain unlucky. I didn't have time to deep dive, APT works fine. *Edit: I [later](https://news.ycombinator.com/item?id=22972661) [learned](https://news.ycombinator.com/item?id=23052108) that people also dislike Snaps due to their closed/non-standard nature.*
- There are [other ways](https://help.ubuntu.com/stable/ubuntu-help/tips-specialchars.html.en) but if I want to type an `é`, out of the box, I have to type: `<Ctrl><Shift>U` `00e9`, and then, that doesn't work in my code editor. Long-pressing a button on macOS wins! I keep forgetting this code, too, so I'm saying `Renee` a lot. Sorry `Renée`! *Edit: readers suggested to use the Compose key. I've updated CLI instructions with it and it works much better. A default mapped key and brief introduction for new users may be a good idea.*
- As [tedthetrumpet points out in the comments](#comment-4671113771), I too really miss Preview on Spacebar. *Edit: as Cory Flick mentions in the comments we can install [gnome-sushi](http://www.ubuntugeek.com/gnome-sushi-quick-previewer-for-nautilus.html) for that. Thanks!!* - *Edit2: In Groovy Sushi broke, you have to [compile it from source](https://gitlab.gnome.org/GNOME/sushi/-/issues/48#note_946265) to get it to work again*
- I upgraded to 19.10 Eoan and when I rebooted it said `error: Unknown TPM error.`, followed by `error: you need to load the kernel first` and that was that. It turned out I had to [**disable TPM in my BIOS**](https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1848892). To this day I haven't done the deep dive on what that even is and why it would matter but that solved it :shrug:
- In VSCode when switching tabs or scrolling, the brightness would **flicker**. It turned out I had to pass the `--disable-gpu` argument or [set `"window.titleBarStyle": "native"`](https://github.com/microsoft/vscode/issues/41513) which is a bit sad because I had custom colored titlebars per project before.
- When I upgraded to the 5.x kernel that comes with Eoan, my **Wi-Fi hardware wasn't recognized**. I reverted and it looks like I'll be stuck on the OEM kernel by Dell for a while.

<strong><sup>1</sup></strong> <small>Since I run a video encoding business, having our test suites pass faster locally ramps up my productivity near-linearly, so the company is happy to sponsor the fastest machine. I do realize I'm incredibly privileged like that. If this is something you feel you need to get in on, <a href="https://transloadit.com/jobs/">come work for us</a>.</small>

## Things I could fix

So here's what I did to make Ubuntu usable as a day-to-day workhorse, coming from Mac. Please regard it as a grab bag containing opinionated things that may very well drive you nuts, and probably don't blindly paste my entire setup.

Most of the apps and settings in this post, you'll be able to pick from Ubuntu's graphical interfaces. However by having everything as CLI commands, _I can_ paste it all and have a new machine configured identically in minutes. So I'll be referring to this post myself and will tweak it as I go. Any changes I make to my machine that I want to persist, even if I initially make them in the GUI, I'm sure to document those back as CLI commands right here.

**Disclaimer: This worked for Ubuntu 19.04. It may not work for other versions.** If you use a more recent version and know how to make things compatible, please drop a line in the [comments below](#comments-area).

### Basics

#### Small Tools & Utilities

Install basic tools, some of which (`curl`, `dconf-cli`, `xdotool`) we'll also need to execute further steps. 

```bash
sudo apt install \
  awscli \
  curl \
  dconf-cli \
  gnome-startup-applications \
  exiftool \
  htop \
  ipcalc \
  jq \
  logtail \
  mc \
  mlocate \
  tmux \
  whois \
  xdotool \
&& true
# ^-- was dconf-tools before Ubuntu 19.04
```
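Since later steps lean on a few of these tools, a quick check that they ended up on the `PATH` can save some head-scratching. A minimal sketch; the function name `missing_cmds` is made up for illustration, and note that the `dconf-cli` package provides the `dconf` command:

```bash
# Hypothetical helper: print every given command name that is not on the PATH.
missing_cmds() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" > /dev/null 2>&1 || echo "$cmd"
  done
}

# Example: missing_cmds curl dconf xdotool
```

No output means everything was found.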

#### Overwrite APT repositories in a denser way, and enable more repos

This will also unlock (but not install yet) non-free codecs for example.

```bash
release=$(lsb_release -cs)
echo "deb http://archive.ubuntu.com/ubuntu/ ${release} main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu/ ${release}-updates main restricted universe multiverse
deb http://archive.ubuntu.com/ubuntu/ ${release}-security main restricted universe multiverse
" | sudo tee /etc/apt/sources.list \
  && sudo apt update
```
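Because the command above overwrites `/etc/apt/sources.list`, a more cautious variant could preview the content and keep a backup first. A sketch under assumptions: the function name `render_sources_list` and the `.bak` backup location are illustrative choices, not something APT prescribes.

```bash
# Hypothetical helper: print the sources.list content for a given release codename.
render_sources_list() {
  local release="$1" pocket
  for pocket in "" "-updates" "-security"; do
    echo "deb http://archive.ubuntu.com/ubuntu/ ${release}${pocket} main restricted universe multiverse"
  done
}

# Preview first:
#   render_sources_list "$(lsb_release -cs)"
# Then back up and write:
#   sudo cp -a /etc/apt/sources.list /etc/apt/sources.list.bak
#   render_sources_list "$(lsb_release -cs)" | sudo tee /etc/apt/sources.list
```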

#### Apt-file

So you can search for e.g. `apt-file search bin/aws` to find out what APT package the `aws` command belongs to again.

```bash
sudo apt install apt-file && sudo apt-file update
```

#### Browser

Trying to have less Google in my life and so I'll open up a Terminal and install Firefox as my main driver:

```bash
# This also installs some of those non-free codecs so I can watch videos online:
sudo apt install firefox ubuntu-restricted-extras
```

As a backup for whenever Google Hangouts does not work in Firefox, this installs Chrome:

```bash
curl -fsSL https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add - \
  && echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" | sudo tee /etc/apt/sources.list.d/google-chrome.list \
  && sudo apt update \
  && sudo apt install google-chrome-stable \
  && true
```

Or to check out Brave:

```bash
sudo apt install apt-transport-https curl gnupg \
  && curl -s https://brave-browser-apt-release.s3.brave.com/brave-core.asc | sudo apt-key --keyring /etc/apt/trusted.gpg.d/brave-browser-release.gpg add - \
  && echo "deb [arch=amd64] https://brave-browser-apt-release.s3.brave.com/ stable main" | sudo tee /etc/apt/sources.list.d/brave-browser-release.list \
  && sudo apt update \
  && sudo apt install brave-browser \
  && true
```
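`apt-key` is deprecated in newer APT releases, so on a more recent Ubuntu these snippets may print warnings or stop working. One alternative is the per-repository `signed-by` keyring pattern that the VSCode and GitHub CLI steps in this post already use. A sketch, assuming a keyring filename of my own choosing (`google-chrome.gpg`) and a made-up helper name:

```bash
# Hypothetical helper: build a sources.list line pinned to one keyring file.
apt_source_line() {
  local keyring="$1" url="$2" suite="$3" component="$4"
  echo "deb [arch=amd64 signed-by=${keyring}] ${url} ${suite} ${component}"
}

# Example for Chrome (the keyring filename is an assumption):
#   curl -fsSL https://dl-ssl.google.com/linux/linux_signing_key.pub \
#     | gpg --dearmor | sudo tee /usr/share/keyrings/google-chrome.gpg > /dev/null
#   apt_source_line /usr/share/keyrings/google-chrome.gpg \
#     http://dl.google.com/linux/chrome/deb/ stable main \
#     | sudo tee /etc/apt/sources.list.d/google-chrome.list
```

The upside is that a vendor key is only trusted for that vendor's repository rather than for every source on the system.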

#### 1Password Password Manager

```bash
sudo apt-key --keyring /usr/share/keyrings/1password.gpg adv --keyserver keyserver.ubuntu.com --recv-keys 3FEF9748469ADBE15DA7CA80AC2D62742012EA22 \
  && echo 'deb [arch=amd64 signed-by=/usr/share/keyrings/1password.gpg] https://downloads.1password.com/linux/debian edge main' | sudo tee /etc/apt/sources.list.d/1password.list \
  && sudo apt update \
  && sudo apt install 1password \
  && true
```

#### Email

Haven't found a really neat one yet. So far using Thunderbird for work email and a web interface (yeah still Gmail) for personal. Thunderbird is fast and functional and doesn't surprise me in bad ways. That's about all the nice things that I can say about it :smile: To install it:

```bash
sudo apt install thunderbird
```

#### Spotify

Can't not have music! We'll add their own APT repository so we can enjoy regular updates. And you can read a little higher up why I try to stay away from Snaps for now, and use APT instead.

```bash
curl -fsSL https://download.spotify.com/debian/pubkey.gpg | sudo apt-key add - \
  && echo "deb http://repository.spotify.com stable non-free" | sudo tee /etc/apt/sources.list.d/spotify.list \
  && sudo apt update \
  && sudo apt install spotify-client \
  && true
```

### Collaboration

#### Slack

```bash
cd /tmp \
  && curl -fsSLo slack.deb https://downloads.slack-edge.com/linux_releases/slack-desktop-4.0.2-amd64.deb \
  && sudo apt install ./slack.deb \
  && cd -
```

#### Dropbox 

```bash
cd /tmp \
  && sudo apt install libpango1.0-0 libpangox-1.0-0 python3-gpg \
  && curl -fsSLo dropbox.deb "https://www.dropbox.com/download?dl=packages/ubuntu/dropbox_2019.02.14_amd64.deb" \
  && sudo apt install ./dropbox.deb \
  && dropbox autostart y \
  && true
# now type CMD+SPACE (or just CMD if you don't remap like I did below), type dropbox, ENTER
```

#### Signal

We use this with the team to transmit secrets to each other.

```bash
echo "deb [arch=amd64] https://updates.signal.org/desktop/apt xenial main" | sudo tee -a /etc/apt/sources.list \
  && curl -fsSL https://updates.signal.org/desktop/apt/keys.asc | sudo apt-key add - \
  && sudo apt update \
  && sudo apt install signal-desktop \
  && true

# Do not register Signal as the default App to open html files
# https://github.com/signalapp/Signal-Desktop/issues/3602
if xdg-mime query default text/html | grep -q signal; then
  echo "Fixing bug: Found Signal as mime type handler for text/html"
  if which firefox > /dev/null; then
    echo "Restoring handler for text/html as Firefox"
    xdg-mime default firefox.desktop text/html
  elif which chromium-browser > /dev/null; then
    echo "Restoring handler for text/html as Chromium"
    xdg-mime default chromium-browser.desktop text/html
  else
    echo "Could not find a handler, no handler set for text/html"
    xdg-mime default /bin/true text/html
  fi
fi  
```

#### Gimp

Probably not as nice as Photoshop, but I'm not a designer and for me it gets the job done.

```bash
sudo apt install gimp
```

### Development 

#### VSCode

You'll know how to replace this with your own favorite editor.

```bash
curl -fsSL https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > packages.microsoft.gpg \
  && sudo install -o root -g root -m 644 packages.microsoft.gpg /usr/share/keyrings/ \
  && sudo sh -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/packages.microsoft.gpg] https://packages.microsoft.com/repos/vscode stable main" > /etc/apt/sources.list.d/vscode.list' \
  && sudo apt install apt-transport-https \
  && sudo apt update \
  && sudo apt install code \
  && true
# Afterwards I install the extension Settings Sync and enter the Gist ID (df2624fb06dc2d3b8890a28d4caa3820 in my case)
# to setup VSCode to my preferences. For uploading changes to the settings, you'll need a GitHub token.
```

#### If you want to allow incoming SSH connections:

```bash
sudo apt install openssh-server
```

#### Access to code, SSH Key for github and syncing your code dir from another machine

```bash
ssh-keygen -t rsa -b 4096 -C "${USER}@${HOSTNAME}"
# Press enter on all questions, then 
cat ~/.ssh/id_rsa.pub
# and add it to https://github.com/settings/keys
# probably also add it to the machine that currently has 
# your code in ~/.ssh/authorized_keys <-- and chmod it to 600, 
# so you can sync it to your new machine via, e.g.:
rsync -a --progress --ignore-existing --exclude='node_modules/' --exclude='.Trash-*' "10.0.1.144:/${HOME}/code/" /home/kvz/code
```
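The "add it to `authorized_keys` and chmod it to 600" step on the other machine can be sketched as a small function. `ssh-copy-id` is the usual tool for this; the helper below (its name and arguments are invented for illustration) only spells out what happens:

```bash
# Hypothetical helper: append a public key to an authorized_keys file with safe permissions.
# $1 = path to the public key, $2 = target .ssh directory (defaults to ~/.ssh)
authorize_key() {
  local pubkey="$1" sshdir="${2:-${HOME}/.ssh}"
  mkdir -p "$sshdir" && chmod 700 "$sshdir" || return 1
  cat "$pubkey" >> "${sshdir}/authorized_keys" || return 1
  chmod 600 "${sshdir}/authorized_keys"
}

# Example (run on the machine that currently has your code):
#   authorize_key /tmp/id_rsa.pub
```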

#### Git

```bash
# Git LFS
cd /tmp \
  && curl -fsSLo script.sh https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh \
  && sudo bash script.sh \
  && sudo apt install git-lfs \
  && git lfs install \
  && cd -

# Avoid warning: Pulling without specifying how to reconcile divergent branches ..
git config --global pull.rebase false
```

#### Git Verified Commits

Dangerous, best not copy my setup but reference [this guide](https://gist.github.com/Beneboe/3183a8a9eb53439dbee07c90b344c77e).

```bash
# rm -rf ~/.gnupg
# Cross-platform macOS compatibility:
ln -nfs ~/Shared/Configs/gnupg ~/.gnupg
chmod 700 /home/kvz/.gnupg
sudo ln -nfs /usr/bin/gpg /usr/local/bin/gpg
sudo ln -nfs /usr/bin/pinentry-gnome3 /usr/local/bin/pinentry-mac

# List existing keys, pick key to sign commits with:
gpg --list-secret-keys --keyid-format LONG
# take short key and use as XXXXXXXX
git config --global user.signingkey XXXXXXXX
git config --global commit.gpgsign true
git config --global gpg.program /usr/local/bin/gpg
```
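Picking the key ID out of the `gpg` listing by eye works, but it can also be scripted. A minimal sketch, assuming the usual `sec   <algo>/<KEYID> <date> ...` line layout of `--keyid-format LONG`; the function name is made up:

```bash
# Hypothetical helper: read `gpg --list-secret-keys --keyid-format LONG` output
# on stdin and print the long key ID of each secret key.
secret_key_ids() {
  awk '/^sec/ { split($2, parts, "/"); print parts[2] }'
}

# Example:
#   gpg --list-secret-keys --keyid-format LONG | secret_key_ids
```

If more than one ID is printed, you'd still have to choose which one goes into `user.signingkey`.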

#### GitHub Desktop

I also still use the CLI, but for staging commits, there's no beating the GUI. Yes I did try `git add -p`. No it doesn't come close! Sorry! :smile:

```bash
# ran into multiple issues (such as segmentation faults
# and https://github.com/shiftkey/desktop/issues/59) with the default snap,
# so using this for now:
cd /tmp \
  && curl -fsSLo GitHubDesktop.deb https://github.com/shiftkey/desktop/releases/download/release-2.9.3-linux1/GitHubDesktop-linux-2.9.3-linux1.deb \
  && sudo apt install ./GitHubDesktop.deb \
  && cd -
# If you have GitHub 2FA and use HTTPS repositories (vs SSH),
# create a personal access token on GitHub, and use that as your password.
# to make it persist, set: git config --global credential.helper store # (more convenient)
#                      or: git config --global credential.helper cache # (safer)
```

#### GitHub CLI

```bash
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg \
  && sudo chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg \
  && echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
  && sudo apt update \
  && sudo apt install gh \
  && gh auth login \
  && echo -e '\neval "$(gh completion -s bash)" || true' >> ~/.bash_profile \
  && true
```

Here's how I create todos from the CLI for different projects by just typing: `gh todo-website`. You'll want to change the values here of course, but maybe it can provide some inspiration.

```bash
gh alias set todo-meta 'issue create --repo=transloadit/team-internals --project="🤖 The Board" --body="n/a" --label="meta" --assignee="@me"' \
  && gh alias set todo-founder 'issue create --repo=transloadit/founder-internals --project="🤖 The Board" --body="n/a" --label="meta" --assignee="@me"' \
  && gh alias set todo-accounting 'issue create --repo=transloadit/accounting --project="🤖 The Board" --body="n/a" --label="accounting" --assignee="@me"' \
  && gh alias set todo-legal 'issue create --repo=transloadit/legal --project="🤖 The Board" --body="n/a" --label="legal" --assignee="@me"' \
  && gh alias set todo-website 'issue create --repo=transloadit/content --project="🤖 The Board" --body="n/a" --label="website" --assignee="@me"' \
  && gh alias set todo-content 'issue create --repo=transloadit/content --project="🤖 The Board" --body="n/a" --label="content" --assignee="@me"' \
  && gh alias set todo-nix 'issue create --repo=transloadit/transloadit-api2 --project="🤖 The Board" --body="n/a" --label="nix" --assignee="@me"' \
  && gh alias set todo-api2 'issue create --repo=transloadit/transloadit-api2 --project="🤖 The Board" --body="n/a" --label="api" --assignee="@me"' \
  && gh alias set todo-growth 'issue create --repo=transloadit/growth --project="🤖 The Board" --body="n/a" --label="growth" --assignee="@me"' \
  && gh alias set todo-botty 'issue create --repo=transloadit/botty --project="🤖 The Board" --body="n/a" --label="satellite" --assignee="@me"' \
  && true
```
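Since these aliases differ only in repository and label, the repetition could be folded into a loop. A sketch, with helper names invented for illustration; the project name and flags are copied from the aliases above:

```bash
# Hypothetical helper: print the `gh alias` expansion for one repo/label pair.
todo_alias_body() {
  local repo="$1" label="$2"
  printf '%s\n' "issue create --repo=${repo} --project=\"🤖 The Board\" --body=\"n/a\" --label=\"${label}\" --assignee=\"@me\""
}

# Hypothetical helper: read "name repo label" rows on stdin and register one alias per row.
register_todo_aliases() {
  local name repo label
  while read -r name repo label; do
    gh alias set "todo-${name}" "$(todo_alias_body "$repo" "$label")"
  done
}

# Example:
#   printf '%s\n' \
#     'website transloadit/content website' \
#     'legal transloadit/legal legal' \
#     | register_todo_aliases
```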

#### Starship & powerline fonts

This gives me a cool prompt.

```bash
cd /tmp \
  && sudo apt install fonts-powerline fonts-firacode \
  && curl -fsSLo starship-v0.15.0-x86_64-unknown-linux-gnu.tar.gz https://github.com/starship/starship/releases/download/v0.15.0/starship-v0.15.0-x86_64-unknown-linux-gnu.tar.gz \
  && tar zxvf starship-v0.15.0-x86_64-unknown-linux-gnu.tar.gz \
  && sudo cp -af ./x86_64-unknown-linux-gnu/starship /usr/local/bin/starship \
  && cd -
# Now add to your ~/.bash_profile: [ -x /usr/local/bin/starship ] && eval "$(starship init bash)"
```
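Adding that line to `~/.bash_profile` by hand is fine, but if this post gets pasted more than once, the profile can end up with duplicates. A small sketch of an idempotent append; `append_once` is a made-up name, not an existing tool:

```bash
# Hypothetical helper: append a line to a file only if that exact line is not there yet.
append_once() {
  local line="$1" file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example:
#   append_once '[ -x /usr/local/bin/starship ] && eval "$(starship init bash)"' ~/.bash_profile
```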

#### Shellcheck

Linting for Bash scripts.

```bash
sudo apt install cabal-install && cabal update && cabal install ShellCheck
```

#### Node.js

```bash
# Install Node 16
cd /tmp \
  && sudo rm -f /etc/apt/sources.list.d/nodesource.list* \
  && curl -fsSLo setup.sh https://deb.nodesource.com/setup_16.x \
  && sudo -E bash setup.sh \
  && sudo apt install nodejs \
  && cd -

# make sure apt does not produce warnings while updating, or the Node.js repo setup will silently bail out

# Install yarn
(curl -fsSL https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -) \
  && echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list \
  && sudo apt update \
  && sudo apt install yarn

# Log into npm if you need that
npm login
```

#### Go

```bash
sudo rm -f /etc/apt/sources.list.d/longsleep-ubuntu-golang-backports-* \
  && sudo add-apt-repository ppa:longsleep/golang-backports \
  && sudo apt update \
  && sudo apt install golang-go golang-go.tools \
  && mkdir -p ~/go ~/code \
  && true
```

#### Basic PHP & MySQL CLI 

```bash
sudo apt install php-cli php-mbstring php-mysql mysql-client
# and enable extension=mbstring, extension=mysqlnd, extension=mysqli in $EDITOR /etc/php/*/cli/php.ini
```

#### DataGrip

Together with 1Password, DataGrip is the only software I installed that costs money. Although I was lucky enough to be donated a license for work done on my open source project Locutus.

```bash
# /opt is normally root-owned, hence sudo for the download and unpack
sudo mkdir -p /opt \
  && cd /opt \
  && sudo curl -fsSLo datagrip.tar.gz https://download.jetbrains.com/datagrip/datagrip-2019.2.4.tar.gz \
  && sudo tar zxvf datagrip.tar.gz \
  && cd DataGrip-2019.2.4 \
  && ./bin/datagrip.sh \
  && cd -
```

You should be able to create a launcher from within the application itself. Choose "Tools", "Create desktop entry". This will create an item in the dash (`<Ctrl><Space>`), which you can then drag into the dock.

*Edit: [mxschumacher on Hacker News](https://news.ycombinator.com/item?id=21396238) suggested using [JetBrains Toolbox](https://news.ycombinator.com/item?id=21396238) instead.*

#### Vagrant & VBox 

```bash
echo "deb [arch=amd64] https://download.virtualbox.org/virtualbox/debian $(lsb_release -cs) contrib" | sudo tee -a /etc/apt/sources.list \
  && curl -fsSL https://www.virtualbox.org/download/oracle_vbox_2016.asc | sudo apt-key add - \
  && curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add - \
  && sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main" \
  && sudo apt update \
  && sudo apt install virtualbox linux-headers-$(uname -r) vagrant \
  && vagrant plugin install vagrant-vbguest \
  && true
```

After a kernel/distribution upgrade you'll want to run:

```bash
sudo apt reinstall virtualbox-dkms virtualbox linux-headers-$(uname -r)
```

I had to install a non-signed kernel or I would run into [an issue](https://unix.stackexchange.com/questions/455189/pkcs7-signature-not-signed-with-a-trusted-key) whenever starting VirtualBox.

#### Docker

```bash
sudo apt install \
    apt-transport-https \
    ca-certificates \
    curl \
    software-properties-common \
&& (curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -) \
&& sudo apt-key fingerprint 0EBFCD88 \
&& sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable" \
&& sudo apt update \
&& sudo apt install docker-ce \
&& sudo usermod -aG docker kvz \
&& sudo systemctl enable docker \
&& true

# Login to docker Hub if you need that
docker login

echo "WARNING! REBOOTING HARD NOW. PRESS CTRL+C IF YOU ARE NOT SURE! Sleeping 10s" && sleep 10 && sudo reboot
```

### Tweak Desktop

#### Fix copy-paste

10 years later, copy-paste still leaves a lot to be desired for people coming from other platforms. Different apps use different clipboards :scream:. There are different shortcuts to utilize the different clipboards :scream:. Clipboards reset when apps close :scream:.
Even for copying these commands from Firefox to the terminal with keyboard shortcuts (like `<Ctrl><Insert>`, `<Shift><Insert>`, as `<Ctrl>C` has a different meaning in terminals), I recommend you get this fixed already. Luckily you can, with clipboard managers.

- <https://askubuntu.com/questions/12047/inconsistent-copy-and-paste-behaviour-is-there-a-fix>
- <https://wiki.ubuntu.com/ClipboardPersistence>

As a bonus, let's also install `xclip`, which is nice for copying from the CLI like: `cat /etc/config | xclip`

```bash
sudo apt install parcellite libcanberra-gtk-module xclip
```

I had to `sudo reboot` before this worked. Then press `<Ctrl><Alt>P` to bring up the menu, open preferences, and make sure box 1 is checked and box 2 is unchecked.

Some Linux veterans said they don't have any issues with multiple clipboards, and even enjoy it. And they know not to close an originating app so they don't hit that problem. But I feel for newcomers the appearance of a single clipboard that persists through app lifecycles would be a sane default. Looks at Mark Shuttleworth :eyes:.
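One detail on `xclip` that ties into the multiple-clipboards story: by default it writes to the X primary selection (the middle-click one), not the clipboard that `<Ctrl>V` pastes from. Its `-selection clipboard` option targets the latter. A sketch of two wrappers, whose names `clip` and `unclip` are made up for illustration:

```bash
# Hypothetical wrappers around xclip that target the CLIPBOARD selection.
clip() { xclip -selection clipboard "$@"; }       # copy stdin:  cat file | clip
unclip() { xclip -selection clipboard -o "$@"; }  # print clipboard contents
```

With a clipboard manager syncing both selections this may be unnecessary, but it makes the intent explicit in scripts.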

#### Add emoji support

```bash
cd /tmp \
  && sudo apt install fonts-emojione \
  && curl -fsSLo noto-color-emoj.deb https://launchpad.net/~ubuntu-desktop/+archive/ubuntu/transitions/+files/fonts-noto-color-emoji_0~20170913-0ubuntu1~bionic1_all.deb \
  && sudo apt install ./noto-color-emoj.deb \
  && cd -
# then install GNOME Characters to easily browse emoji:
sudo apt install gnome-characters
```

Some apps will need a hand, e.g. for VSCode you could set (single config shared between Ubuntu & macOS, worked for me): 

```json
"editor.fontFamily": "Menlo, 'Droid Sans Mono', monospace, 'Droid Sans Fallback', Monaco, Consolas, 'Droid Sans Mono', 'Inconsolata', 'Courier New', 'Droid Sans Fallback', 'Noto Color Emoji', 'Apple Color Emoji'"
```

#### Move around windows or resize them by holding `<Alt>` and dragging a window

```bash
sudo apt install compizconfig-settings-manager
ccsm
# Enable the Move plugin. Set Initiate Move to <Alt>Button1. 
```

Tip: the `xev` tool lets you see what button number is associated with e.g. a two finger drag on a touchpad. You can also use this to let apps [open on certain areas of the screen](https://askubuntu.com/questions/107951/how-to-set-a-specific-window-size-and-placement-for-all-windows-that-open-to-def) by default.

#### Better screenshots

Save screenshots in `~/Dropbox/Screenshots`, and use a screenshot tool with annotation support by default.

```bash
sudo apt install flameshot \
  && flameshot config -f '%Y-%m-%d_%H-%M-%S-screenshot' \
  && dconf write /org/gnome/gnome-screenshot/auto-save-directory "['file:///home/${USER}/Dropbox/Screenshots']" \
  && dconf write /org/gnome/gnome-screenshot/border-effect "['shadow']"
# https://askubuntu.com/a/1039949/2222
# Release the <PrtScr> binding by this command
gsettings set org.gnome.settings-daemon.plugins.media-keys screenshot ''
# Go to Settings -> Devices -> Keyboard and scroll to the end. Press + and you will create custom shortcut.
# Enter name: "flameshot", command: `/usr/bin/flameshot gui --path /home/kvz/Dropbox/Screenshots/`. # <-- replace 'kvz' with your username, no $HOME or ~ substitution supported
# Set shortcut to <PrtScr> (print).
```
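The GUI steps in the comments above can probably also be done from the CLI via GNOME's custom-keybinding schema. A sketch under assumptions: the slot name `custom0` is free on your machine, and overwriting the `custom-keybindings` list is acceptable (existing custom shortcuts would have to be merged into that list). The function name is made up.

```bash
# Sketch: register flameshot as a custom GNOME shortcut on <PrtScr>.
# $1 = directory to save screenshots into (pass an absolute path)
add_flameshot_shortcut() {
  local dir="$1"
  local schema="org.gnome.settings-daemon.plugins.media-keys"
  local path="/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom0/"
  # Assumption: no other custom shortcuts exist, since this replaces the whole list
  gsettings set "${schema}" custom-keybindings "['${path}']"
  gsettings set "${schema}.custom-keybinding:${path}" name 'flameshot'
  gsettings set "${schema}.custom-keybinding:${path}" command "/usr/bin/flameshot gui --path ${dir}"
  gsettings set "${schema}.custom-keybinding:${path}" binding 'Print'
}

# Example: add_flameshot_shortcut "${HOME}/Dropbox/Screenshots/"
```

Here the shell expands `${HOME}` before the command is stored, which sidesteps the lack of `$HOME` or `~` substitution in the shortcut itself.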

#### Disable `<PgUp>` and `<PgDn>` in VSCode

Those buttons are awfully close to the left and right cursor keys, which I use a lot while coding in VSCode, so disabling/changing the behavior there was sufficient for me:

```json
  {
    "key": "ctrl+up",
    "command": "cursorPageUp",
    "when": "textInputFocus"
  },
  {
    "key": "ctrl+down",
    "command": "cursorPageDown",
    "when": "textInputFocus"
  },
  {
    "key": "pagedown",
    "command": "cursorRight",
    "when": "textInputFocus"
  },
  {
    "key": "pageup",
    "command": "cursorLeft",
    "when": "textInputFocus"
  },
```

One of those cases where retraining muscle memory took enough time to want to change the system instead.

#### Make the Apple Magic Trackpad 2 work on Ubuntu

Edit 2021-02-08: The Magic Trackpad 2 is now working, but you do need to increase finger sensitivity via `xinput`:

```bash
cat << EOF > ~/touchpad_settings.sh
#!/usr/bin/env bash
set -eux
deviceId=\$(xinput list |awk -F'id=' '/Apple Inc. Magic Trackpad/ {print \$2}' |awk '{print \$1}')
propId=\$(xinput list-props \${deviceId} |awk '/Synaptics Finger/ {print \$3}' |tr -d -c 0-9)
xinput set-prop \${deviceId} \${propId} 2, 2, 0

synclient ClickFinger3=2
synclient HorizTwoFingerScroll=1
synclient TapButton2=0
synclient TapButton1=1
EOF
chmod +x ~/touchpad_settings.sh
~/touchpad_settings.sh
gsettings set org.gnome.settings-daemon.peripherals.input-devices hotplug-command "${HOME}/touchpad_settings.sh"
```

Distilled from this [reddit thread](https://www.reddit.com/r/Ubuntu/comments/ahgbdg/apple_magic_trackpad_2_configuration_on_kubuntu/).
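If the trackpad is not connected when the script runs, `deviceId` ends up empty and the later `xinput` calls fail in confusing ways. A sketch of a slightly more defensive lookup; the function name is invented, and the sample device name is the one used in the script above:

```bash
# Hypothetical helper: read `xinput list` output on stdin and print the id of
# the first device whose name contains the given string.
xinput_device_id() {
  local name="$1"
  awk -v name="$name" 'index($0, name) { sub(/.*id=/, ""); print $1; exit }'
}

# Example:
#   deviceId="$(xinput list | xinput_device_id 'Apple Inc. Magic Trackpad')"
#   [ -n "$deviceId" ] || { echo "Trackpad not found" >&2; exit 1; }
```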

#### Make keyboard shortcuts, navigation, gestures, layout more like I had on macOS 

This is highly personal because even on macOS I already personalized (where I stole much from [Mathias Bynens' Dotfiles](https://github.com/mathiasbynens/dotfiles/blob/master/.macos)) so you'll probably only cherry-pick a few here.

```bash
# If you have an Apple keyboard and want the function keys to act like F1-F12 by default (disable Fn default behavior) give the following command in terminal:

echo 2 | sudo tee /sys/module/hid_apple/parameters/fnmode

# Disable many keybindings that manage windows to free them up for VSCode,
# except for a few ones I also had on macOS
dconf write /org/gnome/desktop/wm/keybindings/maximize "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/move-to-monitor-down "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/move-to-monitor-left "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/move-to-monitor-right "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/move-to-monitor-up "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/move-to-workspace-down "['<Super><Ctrl>Right','<Super><Ctrl>Down']"
dconf write /org/gnome/desktop/wm/keybindings/move-to-workspace-left "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/move-to-workspace-right "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/move-to-workspace-up "['<Super><Ctrl>Left','<Super><Ctrl>Up']"
dconf write /org/gnome/desktop/wm/keybindings/switch-applications "['<Ctrl>Tab','<Alt>Tab']"
dconf write /org/gnome/desktop/wm/keybindings/switch-applications-backward "['<Shift><Alt>Tab','<Shift><Ctrl>Tab']"
dconf write /org/gnome/desktop/wm/keybindings/switch-group "['<Ctrl>Above_Tab','<Alt>Above_Tab']"
dconf write /org/gnome/desktop/wm/keybindings/switch-group-backward "['<Shift><Ctrl>Above_Tab','<Shift><Alt>Above_Tab']"
dconf write /org/gnome/desktop/wm/keybindings/switch-input-source "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/switch-input-source-backward "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/switch-panels "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/switch-panels-backward "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/switch-to-workspace-down "['<Super>Right','<Super>Down']"
dconf write /org/gnome/desktop/wm/keybindings/switch-to-workspace-left "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/switch-to-workspace-right "['disabled']"
dconf write /org/gnome/desktop/wm/keybindings/switch-to-workspace-up "['<Super>Left','<Super>Up']"
dconf write /org/gnome/desktop/wm/keybindings/unmaximize "['disabled']"
dconf write /org/gnome/mutter/keybindings/switch-monitor "['XF86Display']"
dconf write /org/gnome/mutter/keybindings/toggle-tiled-left "['<Super><Ctrl><Alt>Left']"
dconf write /org/gnome/mutter/keybindings/toggle-tiled-right "['<Super><Ctrl><Alt>Right']"
dconf write /org/gnome/shell/keybindings/toggle-overview "['<Super>Space','<Ctrl>Space']"
dconf write /org/gnome/desktop/input-sources/xkb-options "['caps:escape']" 
# ^-- to swap vs map both to escape: "['caps:swapescape']". 

# Update 2020-12-28: After readers told me about the Compose key,
# I decided to try it. Now, to type the accented letter `é`,
# I press compose (`<rAlt>`) then `'` then `e`. Not too bad at all
# and it takes just this one line to set up. Note that this write replaces
# the xkb-options value set above, so both options are listed here:
dconf write /org/gnome/desktop/input-sources/xkb-options "['caps:escape','compose:ralt']"
# The list of all the codes is in /usr/share/X11/locale/en_US.UTF-8/Compose
# and you can customize via `vim ~/.XCompose` `<Multi_key> <e> <m> <o> <s>: "✨"` then `ibus restart`

# More macOS-like tab navigation in the terminal
# Find all possible config keys via: gsettings list-recursively |grep Terminal
gsettings set org.gnome.Terminal.Legacy.Keybindings:/org/gnome/terminal/legacy/keybindings/ next-tab "<Primary>braceright"
gsettings set org.gnome.Terminal.Legacy.Keybindings:/org/gnome/terminal/legacy/keybindings/ prev-tab "<Primary>braceleft"
gsettings set org.gnome.Terminal.Legacy.Keybindings:/org/gnome/terminal/legacy/keybindings/ move-tab-left "<Primary><Shift>Left"
gsettings set org.gnome.Terminal.Legacy.Keybindings:/org/gnome/terminal/legacy/keybindings/ move-tab-right "<Primary><Shift>Right"
gsettings set org.gnome.Terminal.Legacy.Keybindings:/org/gnome/terminal/legacy/keybindings/ close-tab "<Primary>w"
gsettings set org.gnome.Terminal.Legacy.Keybindings:/org/gnome/terminal/legacy/keybindings/ new-tab "<Primary>t"
 
# Make the Dock more macOS-like
gsettings set org.gnome.shell.extensions.dash-to-dock extend-height false
gsettings set org.gnome.shell.extensions.dash-to-dock dock-position BOTTOM
gsettings set org.gnome.shell.extensions.dash-to-dock transparency-mode FIXED
gsettings set org.gnome.shell.extensions.dash-to-dock dash-max-icon-size 32
gsettings set org.gnome.shell.extensions.dash-to-dock unity-backlit-items true
# I once had all dash-to-dock icons disappear. Resetting resolved it:
gsettings reset org.gnome.shell.extensions.dash-to-dock extend-height 
gsettings reset org.gnome.shell.extensions.dash-to-dock dock-position 
gsettings reset org.gnome.shell.extensions.dash-to-dock transparency-mode 
gsettings reset org.gnome.shell.extensions.dash-to-dock dash-max-icon-size 
gsettings reset org.gnome.shell.extensions.dash-to-dock unity-backlit-items 

# Activate Gnome Activities Overview on hot corner <-- careful you may find this annoying
gsettings set org.gnome.shell enable-hot-corners true
# As of Ubuntu 19.10 this is:
gsettings set org.gnome.desktop.interface enable-hot-corners true

# (as of 19.10) Dark tabs for the Terminal to me look better regardless of the system theme
gsettings set org.gnome.Terminal.Legacy.Settings theme-variant "dark"

# Disable left super overview, bind to Super Up (use <Alt>F1 or hot corner)
gsettings set org.gnome.mutter overlay-key ""

# <Alt>left click to move windows (without dragging the titlebar)
gsettings set org.gnome.desktop.wm.preferences mouse-button-modifier "<Alt>"
# <Alt>right click to resize windows (without dragging the titlebar)
gsettings set org.gnome.desktop.wm.preferences resize-with-right-button true

# Disable <Alt><Ctrl>S minimizing windows (freeing it up for VSCode Save-All)
gsettings set org.gnome.desktop.wm.keybindings toggle-shaded "['disabled']"

# Still click an app in the dock to open, but if it's open already, this makes a click minimize it
gsettings set org.gnome.shell.extensions.dash-to-dock click-action "minimize"

# Show weekday in clock
gsettings set org.gnome.desktop.interface clock-show-weekday true

# Disable <Alt> showing the menu/HUD, making windows very jumpy whenever you press a <Alt> involved key combo
gsettings set org.compiz.integrated show-hud "['']" # <-- No longer works on Ubuntu 19.10, does anybody know how to fix this?

# Move trash can from desktop to dock
gsettings set org.gnome.shell.extensions.dash-to-dock show-trash true
gsettings set org.gnome.shell.extensions.desktop-icons show-trash false

# Show directories above files
dconf write /org/gtk/settings/file-chooser/sort-directories-first true

# Enable Experimental Fractional Scaling 
# This allows you to scale laptop screens in finer-grained steps than just 100%, 200%. After running this you can go to the display settings
# and choose: 125%, 150%, etc. On my 13" XPS, 125% looks much better.
# WARNING, CAN CAUSE LOAD/INSTABILITY
gsettings set org.gnome.mutter experimental-features "['x11-randr-fractional-scaling']"
```
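
Because `dconf write` replaces the whole value of a key, several `xkb-options` have to go into a single array if you want them all to stick. A minimal sketch of building that array, assuming a made-up helper named `xkb_options_value` (it is not part of dconf):

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn its arguments into a GVariant string array,
# e.g. ['caps:escape','compose:ralt'], suitable for `dconf write`.
xkb_options_value() {
  local out="" opt
  for opt in "$@"; do
    # Prefix a comma only when something is already in the list
    out="${out:+${out},}'${opt}'"
  done
  printf '[%s]\n' "$out"
}

# Usage (not executed here):
#   dconf write /org/gnome/desktop/input-sources/xkb-options \
#     "$(xkb_options_value caps:escape compose:ralt)"
```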

You can list more options via e.g.:

```bash
gsettings list-recursively org.gnome.Terminal
gsettings list-recursively org.gnome.shell
gsettings list-recursively org.gnome.desktop.interface
```
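
When a shortcut does not do what you expect, it may already be bound elsewhere. A small sketch for searching that output for an accelerator; `find_binding` is a made-up name and it only does a fixed-string match on each `schema key value` line:

```bash
# Hypothetical helper: print the settings lines on stdin that mention an accelerator.
find_binding() {
  grep -F -- "$1"
}

# Usage (not executed here):
#   gsettings list-recursively | find_binding "<Super>Left"
```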

You can reset a dconf setting via e.g.:

```bash
# Bring back capslock/escape mapping to default behavior
dconf reset /org/gnome/desktop/input-sources/xkb-options
```
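
Before experimenting, it can help to snapshot a whole subtree so there is something to go back to. A rough sketch using `dconf dump` and `dconf load`; the two function names and the backup file path are made up for illustration:

```bash
# Hypothetical helpers wrapping `dconf dump` and `dconf load`.
# The directory must start and end with a slash,
# e.g. /org/gnome/desktop/wm/keybindings/
dconf_backup() {
  dconf dump "$1" > "$2"
}

dconf_restore() {
  dconf load "$1" < "$2"
}

# Usage (not executed here):
#   dconf_backup  /org/gnome/desktop/wm/keybindings/ ~/keybindings.dconf
#   dconf_restore /org/gnome/desktop/wm/keybindings/ ~/keybindings.dconf
```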

You can explore more settings visually via:

```bash
sudo apt install dconf-editor
dconf-editor
```

#### Make `open` work

We're using `open` a lot in automation written originally for macOS, in places where (bash) aliases aren't always available, so I symlinked it:

```bash
sudo ln -nfs /usr/bin/xdg-open /usr/bin/open
```
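
For scripts you do control, an alternative is to pick the command per OS instead of relying on the symlink. A sketch, assuming a made-up helper `opener_for` that maps the output of `uname -s` to a command name:

```bash
# Hypothetical helper: map an OS name (as printed by `uname -s`) to its "open" command.
opener_for() {
  case "$1" in
    Darwin) echo "open" ;;
    Linux)  echo "xdg-open" ;;
    *)      echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# Usage (not executed here):
#   "$(opener_for "$(uname -s)")" https://kvz.io
```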

#### Bug workarounds / fixes

Remove the Orca screen reader, which was using 100% CPU and making the system laggy, and stop the screen reader from activating on the shortcut that used to be Save-All in VSCode on macOS.

```bash
killall -9 orca \
  ; sudo apt purge orca \
  ; gsettings set org.gnome.settings-daemon.plugins.media-keys screenreader "['disabled']"

# Fix ENOSPC when watching many files
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p

# Remove Amazon launcher
sudo rm /usr/share/applications/ubuntu-amazon-default.desktop 
```
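
Note that `tee -a` appends the inotify line again on every run. A sketch of an idempotent variant; `ensure_line` is a made-up helper, shown here writing to a file the current user owns (for `/etc/sysctl.conf` it would have to run as root):

```bash
# Hypothetical helper: append a line to a file only if that exact line is not there yet.
ensure_line() {
  local line="$1" file="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (not executed here, run as root):
#   ensure_line "fs.inotify.max_user_watches=524288" /etc/sysctl.conf && sysctl -p
```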

#### How to enable all APT repos again after upgrading to a new release

First remove old backed up sources lists:

```bash
sudo rm -f /etc/apt/sources.{list,list.d/}*{~,.save,.distUpgrade}
```

Then switch releases (e.g. from `hirsute` to `impish`) and uncomment the repos:

```bash
for f in /etc/apt/sources.{list,list.d/}*; do 
  [ -d "${f}" ] && continue
  echo "${f}"
  sudo sed -i \
    -e 's/hirsute/impish/g'\
    -e 's/^# \(.*\) # disabled on upgrade to.*/\1/g' \
  "${f}"
done
```
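
To see what the two `sed` expressions do before letting them loose on real files, you can try them on a single line first. The sample line below is made up in the style the release upgrader leaves behind, and the URL is a placeholder:

```bash
# Same two expressions as in the loop above, applied to stdin instead of files.
reenable_line() {
  sed \
    -e 's/hirsute/impish/g' \
    -e 's/^# \(.*\) # disabled on upgrade to.*/\1/g'
}

echo '# deb http://ppa.example.invalid/ubuntu hirsute main # disabled on upgrade to hirsute' \
  | reenable_line
# prints: deb http://ppa.example.invalid/ubuntu impish main
```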

Not all 3rd party vendors will have repos for each Ubuntu release. The release may be too new, or the vendor may only support LTS releases while your upgrade may not be. To counter, now run:

```bash
sudo apt update
```

Find all the 404s, and change them back to the original release (e.g. `focal`) by editing the files inside `/etc/apt`, and re-running `apt update` until there are no more errors.
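
Spotting the failing repos can be partly automated. The sketch below assumes apt reports them with a line of the form `E: The repository '<repo>' does not have a Release file.`; the exact wording may differ between apt versions, and `failing_repos` is a made-up helper:

```bash
# Hypothetical helper: extract the quoted repository from apt's
# "does not have a Release file" errors read on stdin.
failing_repos() {
  sed -n "s/^E: The repository '\(.*\)' does not have a Release file.*/\1/p"
}

# Usage (not executed here):
#   sudo apt update 2>&1 | failing_repos
```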

### Games

#### Latest NVIDIA drivers

From <https://github.com/lutris/docs/blob/master/InstallingDrivers.md>, which has warnings and a compatibility disclaimer; read them.

```bash
sudo add-apt-repository ppa:graphics-drivers/ppa \
  && sudo dpkg --add-architecture i386  \
  && sudo apt update \
  && sudo apt install nvidia-driver-440 libnvidia-gl-440 libnvidia-gl-440:i386 \
  && sudo apt install libvulkan1 libvulkan1:i386 \
  && true
```

#### Lutris

```bash
sudo add-apt-repository ppa:lutris-team/lutris \
  && sudo apt-get update \
  && sudo apt-get install lutris \
  && true
```

#### Steam

```bash
# Steam
cd /tmp \
  && curl -fsSLo steam.deb https://steamcdn-a.akamaihd.net/client/installer/steam.deb \
  && sudo apt install ./steam.deb \
  && cd -
```

#### Age of Empires II

Start Steam, log in, go to settings, advanced, and check: Enable Steam Play for all other titles. Pick the latest Proton. Restart Steam. Now you can install [AOEII](https://store.steampowered.com/app/221380/Age_of_Empires_II_2013/) which is vital software to our remote company :)

#### StarCraft II

Parts taken and modified from <https://www.reddit.com/r/starcraft/comments/5w0wyv/how_to_play_sc2_on_linux_a_full_walk_through/>. Download the Battle.net installer <https://www.blizzard.com/en-us/download/> and make it executable via: 

```bash
chmod 755 ~/Downloads/Battle.net-Setup.exe
```

Get the latest Wine, and install ttf-mscorefonts-installer without which Battle.net will crash.

```bash
sudo dpkg --add-architecture i386 \
  && wget -O - https://dl.winehq.org/wine-builds/winehq.key | sudo apt-key add - \
  && sudo add-apt-repository "deb https://dl.winehq.org/wine-builds/ubuntu/ $(lsb_release -cs) main" \
  && sudo apt install ttf-mscorefonts-installer \
  && sudo apt install wine \
  && sudo apt dist-upgrade \
  && true
```

```bash
# Create a 32bit env:
WINEARCH=win32 WINEPREFIX=~/.wine32 winecfg 
# Now go to libraries, and Add each of these:
# - api-ms-win-crt-math-l1-1-0
# - api-ms-win-crt-stdio-l1-1-0
# - dbghelp
# - msvcp140
# - ucrtbase
# - vcruntime140
# Run installer
WINEARCH=win32 WINEPREFIX=~/.wine32 vblank_mode=0 wine ~/Downloads/Battle.net-Setup.exe
# Play 
WINEARCH=win32 WINEPREFIX=~/.wine32 WINEDEBUG=-all vblank_mode=0 wine ~/.wine32/drive_c/Program\ Files/Battle.net/Battle.net.exe
```
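
The Wine invocations above repeat the same environment variables each time. A small wrapper can save some typing; `wine32` is a made-up function name, and it assumes the `~/.wine32` prefix created above:

```bash
# Hypothetical wrapper: run wine with the 32-bit prefix settings used above.
wine32() {
  WINEARCH=win32 WINEPREFIX="$HOME/.wine32" vblank_mode=0 wine "$@"
}

# Usage (not executed here):
#   wine32 ~/Downloads/Battle.net-Setup.exe
#   WINEDEBUG=-all wine32 ~/.wine32/drive_c/Program\ Files/Battle.net/Battle.net.exe
```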

### FAQ

#### Why `&& true` after that APT command? 

It's a noop that lets me put a `\` after each line (vs all lines but the last) which lets me more easily add/re-order lines. It's much like enjoying trailing commas for all the elements of an array. If you still think that's just too weird, I'm okay with that :smile:

The reason I'm using `&&` chaining at all is that a) programs like APT will read the STDIN and hence empty half of your copy-paste buffer and b) I want all commands to abort at the first sign of trouble.
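
A tiny demonstration of both points, with `false` standing in for a failing command and `run_chain` being a made-up name:

```bash
# Each step runs only if the previous one succeeded; `&& true` is the no-op tail
# that lets every real line end in a backslash.
run_chain() {
  echo "step 1" \
    && false \
    && echo "step 3 (never reached)" \
    && true
}

run_chain || echo "chain aborted with status $?"
# prints:
#   step 1
#   chain aborted with status 1
```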

#### Why didn't you just get an old MacBook

Each new generation of hardware that allows our video encoding test suites to pass quicker, ramps up my productivity near-linearly.

#### Why didn't you get a Mac/Desktop

I sometimes need to pick up my machine to work at my co-founder's place, etc.

#### Why didn't you switch to Hackintosh

Now that's a good question. I didn't even properly look into it, as I feel Apple could pull the plug on that at any time, whether intentionally or not (by no longer including some driver needed for my non-Apple hardware, deploying some new kind of hardware-based signing, etc). Drop a line in the comments if I'm totally wrong about this? 

### Conclusion

If this makes you scared of trying something similar, good. Switching OSes is not for the faint-hearted. Especially to Linux. I read a [Hacker News comment](https://news.ycombinator.com/item?id=21303035) saying that 

> In a hotel (=on Mac), everything is stylish and cared for, but you have very little freedom to change things. At home (=on Linux), you need to do the dishes yourself but there's no external agenda. It's simply yours.

And even though there's enough to disagree with in the analogy, it still resonated. I may still very well keep visiting hotels, I may very well buy Apple's next thing, but having the option now vs basically being forced is liberating.

As you noticed I have plenty of complaints, but no regrets about adding Linux to the mix. Two weeks in, allowing time to tweak to my habits/taste, I was feeling more productive than I was before. Big contributors there are: the OS being robust and snappy, not notifying me about the world as much, and it being fine-tuned to my routines more than macOS would encourage. Having GNU tools vs the minimalistic BSD versions helps, VM-less/faster Docker helps, and having APT to install all the things in a heartbeat is a godsend. 

And say what you will about [Electron](https://electronjs.org/) being a memory hungry beast and so on, and yes that's true and if everything was written in Rust that'd be nice. But it is what allows me to make this jump now, at all. 

<center>
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">On the one hand, yes Electron eats a lot of RAM and native apps are more efficient, BUT you can buy 16GB for $100 (and that will only get cheaper/more abundant), and we do finally get to have nice things on Linux, which is new and exciting.<a href="https://twitter.com/hashtag/vscode?src=hash&amp;ref_src=twsrc%5Etfw">#vscode</a> <a href="https://twitter.com/hashtag/slack?src=hash&amp;ref_src=twsrc%5Etfw">#slack</a></p>&mdash; Kev van Zonneveld (@kvz) <a href="https://twitter.com/kvz/status/1169255826298683392?ref_src=twsrc%5Etfw">September 4, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></center>

Pretty much all my day-to-day tools are cross-platform now thanks to it, so I really even have the freedom to try Windows too. Although I was deep in Linux land when Ballmer said 'Linux is a cancer' and such, and so it will probably still take a few more years of Microsoft good-doing before I'm emotionally ready for something like that. [Not having GPU/CUDA support](https://github.com/microsoft/WSL/issues/3789) in the 'Windows Subsystem for Linux', and seeing that Windows now offers [ads in the start menu](https://www.reddit.com/r/assholedesign/comments/76s2fq/windows_10_puts_ads_inside_the_start_menu/), as well as being a [surveillance station](https://www.wired.com/story/windows-10-privacy-settings/) by default, all don't help I'm afraid!

I'd love to hear what you think about all of this. Maybe I'm seeing it all wrong. I'd also like to hear what other tools you install to make your Linux box _just right_. I regard this post as a Work In Progress so it'll evolve with your feedback. Leave a note in the [comments below](#comments-area), on [Twitter](https://twitter.com/kvz/status/1189475748899409920), [Hacker News](https://news.ycombinator.com/item?id=21395734), [lobste.rs](https://lobste.rs/s/z0t144/going_from_macos_ubuntu) or [Reddit](https://www.reddit.com/r/programming/comments/dp4nod/going_from_macos_to_ubuntu/).

**Update 2019-12-03**: Word is getting out that the 16" MBP that Apple just released addresses most pain points, and has a working keyboard. By the time the budget allows for a new machine again.. Who knows.. I'm tempted.. I miss using Things! But I'm also getting into this Linux groove now. Perhaps a dual boot? Does that work on a Mac? We'll see! Will update the post if something changes.]]></content:encoded>
      <dc:date>2019-10-30T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Freewrite</title>
      <link>https://kvz.io/freewrite.html</link>
      <description><![CDATA[Today I've unwrapped an Astrohaus Freewrite. Half a year ago I was feeling a bit overworked. I'm privileged in that I love my work a lot - so much that if I don't restrain myself it can eclipse the private life. As a result I was looking for new hobbies. Hobbies that wouldn't involve computers so much. At the same time, I enjoy myself the most when I'm doing something useful-ish.
]]></description>
      <pubDate>Tue, 05 Sep 2017 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/freewrite.html</guid>
      <content:encoded><![CDATA[Today I've unwrapped an [Astrohaus Freewrite](https://getfreewrite.com/). Half a year ago I was feeling a bit overworked. I'm privileged in that I love [my work](https://transloadit.com) a lot - so much that if I don't restrain myself it can eclipse the private life. As a result I was looking for new hobbies. Hobbies that wouldn't involve computers so much. At the same time, I enjoy myself the most when I'm doing something useful-ish. 

I thought I'd really enjoy writing more. However, sitting behind a laptop it's easy to fall into the trap of checking email, Slack, or doing research. Before I know it I'm lost on Wikipedia, or Facebook. Those activities have their time and place but I didn't want them leaking into my hobby. I wanted to enjoy some quiet time with just my brain.

You're thinking: "So, use pen and paper". They too, have their moments, but I'm so spoiled by tech that I can't stand the inefficiency of my deplorable handwriting, nor the thought of having to transfer that writing to my blog somehow. I wanted to remove barriers to encourage writing - not add new ones.

<!--more-->

So half a year ago I started searching for a low-tech, low-distraction typing machine that _could_ sync with my computer somehow. There are apps that disable distractions, but my brain knows how to disable the app and it will just become a battle of restraint again. I thought of getting a cheap laptop and stripping it of GUIs, web browsers, etc. Just a command-line editor and Git/Dropbox. But it'd be so tempting to install more and more tools on it. It's a computer. It's hard to look at it without seeing all the lost potential. Then I found out about the Freewrite. It was exactly what I was looking for: a typewriter with an e-ink screen and Wi-Fi. The Wi-Fi could only be used to sync to Dropbox-like services.

<div class="kodak-container kodak-dropshadow">
  <img src="/assets/images/posts/freewrite.png">
</div>

I ordered it on a bit of an impulse half a year ago. They had construction delays and struggled with high demand, and it only arrived today. 

This is the first post I write on it. First impression: I don't think it's particularly pretty. It's expensive. It doesn't have a cursor: make a mistake one paragraph up, and it's _Backspace_ all the way to there.

That last one is tough to swallow. On the Freewrite website they say it's a feature. Something along the lines of: "Writers never had the ability to do this. Knowing that 'undo' is hard, makes you more mindful and deliberate about what you type next, it encourages deeper thought. Not going back and forth keeps you in a creative flow. Just use your computer for the final editing in your Dropbox".

Okay, I guess I'm convinced. Good marketing right there.

So what else is good about the Freewrite?

Well I wrote this blog post, and I can say it was a relaxing experience. It was painless to set up, the keyboard feels nice & retro, and not once did I (have to resist an urge to) switch to something distracting. When I sit down behind it: I'm writing. I commit to the act, until the brain has flushed all of its thoughts and the post is finished in raw form. Before I publish I take a real computer and fix errors, move bits around, try to remove redundant words.

I am cautiously optimistic that it will encourage me to write more. I hope I will, otherwise at `$419` this has been one hell of an expensive blog post, as well as probably the most expensive way I've ever kidded myself :smile: 

Guess we'll just have to keep an eye on this blog to know for sure :ok_hand:

> I suppose this can be regarded as a product review. To be clear: I've never taken any money or other rewards to say anything on this blog, I never will, and this is no exception.

**Update 2018-09-25:** While I've written some rough thoughts on the Freewrite, and also had a lot of fun typing on it as a toy with my daughter, I haven't released a single blog post since this adventure, so definitely an expensive way to fool myself :)
]]></content:encoded>
      <dc:date>2017-09-05T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>How (not) to become a programmer</title>
      <link>https://kvz.io/how-not-to-become-a-programmer.html</link>
      <description><![CDATA[I recently answered a question on Quora:
]]></description>
      <pubDate>Sun, 03 Sep 2017 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/how-not-to-become-a-programmer.html</guid>
      <content:encoded><![CDATA[I recently answered a [question on Quora](https://www.quora.com/Is-it-true-that-programming-is-not-for-everyone/answer/Kevin-van-Zonneveld): 

> Is it true that programming is not for everyone?

And thought I'd elaborate my take on this here.. 

<!--more-->

It is my take that any human brain with the capability to learn has the capability to learn programming. So yes, if you're reading this, you either already can or could likely do it.

Some people point out requirements like being a silent thinker or someone with a lot of patience or a theoretically gifted mind. Those might help to excel, but I've enjoyed programming since the age of 9, yet I don't regard myself particularly blessed with these traits. 

I _have_ always loved building things, and that got me hooked on the feedback loop of making incremental changes and slowly seeing abstract thoughts transform into something fun or valuable. Crafting something out of nothing. This appreciation, to me, is the main thing that sets potential programmers apart from others. I believe however that if you treat your brain as a simple animal that needs its rewards to learn a new trick, anybody could grow this appreciation.

My nephew got a taste of it in [Scratch](https://scratch.mit.edu/), and is now taking his first steps in Python and reading all he can about it. It's hard for him as most great resources to learn programming aren't [available in Dutch](https://twitter.com/kvz/status/802919671611719680) and he's not proficient in English yet. But he has witnessed his ideas come to life and beat me in his own simple reaction-time-based video game, and this provides him with enough fuel to make the rest of the journey effortless. His parents restrict computer-time (probably a good thing?), and so he's often found plotting his next code on paper :scream:.

On the other hand I have had very capable friends in their 20s that would try to get into programming with the prospect of job security, and abandon it because they could not get into that groove. I believe in most cases a big contributor to this was an initial focus on the wrong thing: the perceived destination. They did not allow themselves to meander into seemingly meaningless directions, even though that might have been more fun.

To take your first steps, my advice would be to make near-instant gratification your priority. Spend time trying to find a project that has a good chance of establishing a rewarding feedback loop. Start by building your own personal website, later adding some JavaScript to it. Or a simple quiz or a tool that just decides what's for dinner tonight. These projects are visually and socially stimulating, as you can immediately witness results and proudly show them to your loved ones, or the world :earth_asia:.

If you could enjoy inventing things with _LEGO® Technic_ back in the day, you'll likely quickly be able to tap into that intrinsic motivation. For those having a harder time hooking programming into their brain's [reward system](https://en.wikipedia.org/wiki/Reward_system), try to focus your learning efforts on establishing _just that_ initially, and the rest will follow naturally.

Just one opinion based on one person's experience. What's yours? (comments welcome on HN or Reddit)
]]></content:encoded>
      <dc:date>2017-09-03T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Introducing Lanyon</title>
      <link>https://kvz.io/introducing-lanyon.html</link>
      <description><![CDATA[
  This project also has its own homepage at lanyon.io.

]]></description>
      <pubDate>Wed, 04 Jan 2017 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/introducing-lanyon.html</guid>
      <content:encoded><![CDATA[> This project also has its own homepage at [lanyon.io](https://lanyon.io).

Lanyon is a static site generator. It functions as a wrapper around Jekyll, Webpack, and BrowserSync, in an attempt to give you the best of all worlds. Lanyon allows you to build and refresh assets instantly and offers fast and reliable file watching. 
Breaking with the traditional unix philosophy, Lanyon tries to do many things. Things such as spell checking or setting up a working Ruby environment. One might, of course, argue that Lanyon is applying unix philosophy after all, by restricting its underlying tools to perform only a single task. :thinking:

Whichever the case, Lanyon is certainly okay with embracing any philosophy, as long as it amounts to THE highest level of convenience when it comes to building static websites. Getting started with Lanyon should be as simple as `npm install lanyon --save` and `npm start`.

## State

Lanyon is currently in pre-alpha. We are still making many changes and – in keeping with SemVer tradition – are allowing ourselves to make breaking ones in `<1`. For that and other reasons, we do not recommend using it for anything serious yet.

## Used by

Lanyon is authored by people at [Transloadit](https://transloadit.com), where it already powers their website and most of their pet-projects:

- <https://transloadit.com>
- <https://tus.io>
- <https://transloadify.io>
- <https://freyproject.io>
- <https://bash3boilerplate.sh>
- <https://lanyon.io> :tada: surprise!

If you are an early adopter of Lanyon, [let us know](https://github.com/kvz/lanyon/issues/new) and get listed! :heart:

## Background

Jekyll is great for documentation and static websites. Its ecosystem is vast and mature, and things that are straightforward in Jekyll often require odd workarounds in Hexo or Hugo. Apart from that, backing from GitHub (Pages) isn't the worst thing to have either. With that in mind, we can assume that whatever we invest in Jekyll today will be relevant for a few years to come. 

Admittedly, the other generators can also be very appealing because they humiliate Jekyll in certain areas, such as file watching, asset building, speed, browser integration, and ease of install. 

Here is an opinionated overview:


| Quality                                      |        Hugo        |        Hexo        |             Jekyll              |       Lanyon       |  Webpack/BrowserSync/Nodemon   |
|:---------------------------------------------|:------------------:|:------------------:|:-------------------------------:|:------------------:|:------------------------------:|
| Easy to maintain many documents              | :white_check_mark: |                    | :white_check_mark::arrow_right: | :white_check_mark: |                                |
| Great templating engine                      |                    |                    | :white_check_mark::arrow_right: | :white_check_mark: |                                |
| Vast and mature ecosystem                    |                    |                    | :white_check_mark::arrow_right: | :white_check_mark: | :arrow_left::white_check_mark: |
| Easy to get help                             | :white_check_mark: |                    | :white_check_mark::arrow_right: | :white_check_mark: | :arrow_left::white_check_mark: |
| Backed by GitHub                             |                    |                    | :white_check_mark::arrow_right: | :white_check_mark: |                                |
| Easy to install                              | :white_check_mark: | :white_check_mark: |                                 | :white_check_mark: | :arrow_left::white_check_mark: |
| Browser integration for content reloads      |                    |                    |                                 | :white_check_mark: | :arrow_left::white_check_mark: |
| Fast asset building                          |                    | :white_check_mark: |                                 | :white_check_mark: | :arrow_left::white_check_mark: |
| Fast and robust filewatching                 |                    |                    |                                 | :white_check_mark: | :arrow_left::white_check_mark: |
| HMR / immediate in-browser asset refreshment |                    |                    |                                 | :white_check_mark: | :arrow_left::white_check_mark: |

What we set out to do with Lanyon, is to get the best of all worlds. We are doing so by:

- Taking a sledgehammer :hammer: approach towards getting a suitable version of Ruby to work on your system. Lanyon traverses [Docker](https://www.docker.com), [rbenv](https://github.com/rbenv/rbenv), [RVM](https://rvm.io), and [Homebrew](https://brew.sh), and takes the first method that provides a working Ruby 2 install. All other dependencies are then installed locally in `.lanyon`, relieving any installation pains. 
- Using Browsersync with Webpack middleware, featuring Hot Module Reloading for stylesheets and JavaScript.
- Using Nodemon for `.md` / `.html` file-watching, while kicking incremental Jekyll builds for content.

This enables you to have locally refreshing assets in real time (e.g. in-browser font size changes as you save without the page reloading), and have much more reliable and performant content watching than Jekyll offers. It also gives us libsass (vs Ruby sass), and can sync browsers on many devices in your office so that they will follow along with what you are doing on your main workstation. Even just connecting your phone in this fashion goes a long way in spotting responsive issues quickly. This is a luxury you might normally not be able to afford for your projects, but now the tech to do this works right out of the box with just a single `npm install`!
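
The "take the first method that works" idea from the Ruby bullet above can be sketched in a few lines of shell. This is an illustration only, not Lanyon's actual implementation: `first_available` is a made-up helper, and it merely checks that a command exists, not that it yields a working Ruby 2:

```bash
# Hypothetical helper: print the first of the given commands that is on the PATH.
first_available() {
  local cmd
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      return 0
    fi
  done
  # None of the candidates were found
  return 1
}

# Usage (not executed here):
#   first_available docker rbenv rvm brew
```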

Lanyon is geared towards developer convenience and, as a bonus, offers:

- Deploys to GitHub Pages from Travis CI or your workstation (we are not compatible with as-is `gh-pages` branch-filling, and have no desire to support that)
- JS linting (WIP)
- Markdown linting (WIP)
- Spell checking (WIP)

Lanyon is used by [Transloadit](https://transloadit.com) for static sites, and focuses on their use-case. Trying to get so many moving parts to behave as they should comes with challenges and a ton of configuration options. Lanyon won't support everything that its underlying components have to offer. Lanyon prefers convention over configuration.

We will be assuming:

- Sass
- ES6
- Assets in `./assets/`, with transpiled assets in `./assets/build/`
- Node modules in `./node_modules/`, Bower components in `./assets/bower_components/` (if any)
- Our users already have a working Node.js setup and don't mind a `package.json` in their project
- GitHub pages for deploys (with Travis CI as a builder)
- To simplify things, any environment other than `development` means `production`. If you have additional stages like `test`, you will likely want to test as close to production as possible anyway.
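
The environment rule in the last assumption boils down to a single comparison. A sketch of that rule in shell, not taken from Lanyon's source: `lanyon_mode` is a made-up helper, and treating an unset `LANYON_ENV` as `development` is an assumption:

```bash
# Hypothetical helper: anything other than "development" counts as production.
# Assumption: an unset or empty LANYON_ENV is treated as development.
lanyon_mode() {
  if [ "${LANYON_ENV:-development}" = "development" ]; then
    echo "development"
  else
    echo "production"
  fi
}

# Usage (not executed here):
#   LANYON_ENV=test; lanyon_mode    # a "test" stage resolves to production
```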

If you are thinking about submitting PRs for other features/flexibility, please get in touch first as we might not be on board.

If, however, there happens to be an overlap with your use case and you can live with our constraints, here is how you get started with Lanyon:

## Install

```bash
npm install lanyon --save 
```

## Use

The recommended way to use Lanyon is to add it to your project's npm scripts. In your `package.json`, add:

```javascript
...
"lanyon": {
  "entries": [
    // A single 'app' entry is the default. 
    // List all entries here if you have more
    "app"
  ],
  "gems": {
    // If you require custom gems
    "liquid_pluralize": "1.0.2"
  }
},
"scripts": {
  "install": "bower install && lanyon postinstall",
  "build": "lanyon build",
  "build:emoji": "lanyon build:emoji",
  "build:production": "LANYON_ENV=production lanyon build",
  "serve": "lanyon serve",
  "serve:production": "LANYON_ENV=production lanyon serve",
  "start": "lanyon start",
  "start:production": "npm run build:production && npm run serve:production",
  "encrypt": "lanyon encrypt",
  "deploy": "lanyon deploy"
},
...
```

If you make changes to your gems later on, re-run `npm install` to re-trigger a build.

Have an `assets/app.js` in which you require both javascripts and stylesheets:

```javascript
require('./main.js') // <-- your original sources, as many as you like
require('./style.css') // <-- yes, we also require (s)css. This is a Webpack thing

// Enable Hot Module reloading:
if (module.hot) {
  module.hot.accept('./main.js', function () {
    require('./main.js')
  })
  module.hot.accept('./style.css', function () {
    require('./style.css')
  })
}
```

**Note** You do not have to create your own `app.css` stylesheet entry-point. 
You are supposed to require CSS in `app.js`, which will then be written out by Lanyon
to `app.css` in production (and live in Webpack memory during development). :scream:

Include the build in your layout. The same location works both for production artifact files, and magic Hot Module Reloading during development.


```html
<head>
  <title>No hassle</title> 
  <link rel="stylesheet" href="{{site.lanyon_assets.app.css}}"> 
</head>
<body> ... </body>
<script src="{{site.lanyon_assets.app.js}}"></script>
```


**Note** Lanyon provides the magic `lanyon_assets` variable in Jekyll, pointing to either `/assets/build/app.js` in development, or `/assets/build/app.bfcebf1c103b9f8d41bd.js` in production, so that you can enable long-term caching of assets and also cache-bust them whenever they change. This works for all entries and asset types, and thus also for e.g. `common.css`.
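
To illustrate why a content hash in the filename busts caches: the name changes only when the file's content changes. The sketch below is not how Lanyon or Webpack compute their hashes; `hashed_name` is a made-up helper and it assumes `sha256sum` (GNU coreutils) is installed:

```bash
# Hypothetical helper: print "<name>.<first 20 hex chars of sha256>.<ext>" for a file.
hashed_name() {
  local file="$1" base ext hash
  base="$(basename "$file")"
  ext="${base##*.}"     # text after the last dot
  base="${base%.*}"     # text before the last dot
  hash="$(sha256sum "$file" | cut -c1-20)"
  echo "${base}.${hash}.${ext}"
}

# Usage (not executed here):
#   hashed_name assets/build/app.js
```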

Afterwards, type `npm start`. This will kick a build, spin up file watching and a browser with HMR asset reloading enabled. For more inspiration, check out the `example` folder in the Lanyon repository. The Lanyon website is also bundled under `website`. This is a little bit more advanced, as it builds from the `README.md` and other Markdown files in the repo. This means there is no separate content to maintain on <https://lanyon.io>.

## Reading List

These articles were helpful in creating Lanyon:

- <https://github.com/petehunt/webpack-howto/blob/master/README.md#8-optimizing-common-code>
- <https://www.jonathan-petitcolas.com/2016/08/12/plugging-webpack-to-jekyll-powered-pages.html>
- <https://webpack.github.io/docs/configuration.html#resolve-alias>
- <https://github.com/HenrikJoreteg/hjs-webpack>
- <https://webpack.github.io/docs/webpack-dev-middleware.html>
- <https://stackoverflow.com/a/28989476/151666>
- <https://github.com/webpack/webpack-dev-server/issues/97#issuecomment-70388180>
- <https://webpack.github.io/docs/hot-module-replacement.html>
- <https://github.com/css-modules/webpack-demo/issues/8#issuecomment-133922019>
- <https://github.com/gowravshekar/font-awesome-webpack>
- <https://webpack.github.io/docs/code-splitting.html#split-app-and-vendor-code>
- <https://medium.com/@okonetchnikov/long-term-caching-of-static-assets-with-webpack-1ecb139adb95#.w0elv8n7o>
- <https://webpack.github.io/docs/optimization.html>

## Concluding

If you find yourself wanting to play around with static websites and much of what Lanyon aims to do resonates, I'd love for you to kick its tires and get me some early feedback!
]]></content:encoded>
      <dc:date>2017-01-04T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>The Universal Makefile for JavaScript</title>
      <link>https://kvz.io/a-universal-makefile-for-javascript.html</link>
      <description><![CDATA[
  TL;DR The world is moving from Gulp and Grunt towards npm scripts. This seems like a place we could stay for a long time. But there's one shortcoming with npm scripts, especially when compared to Makefile: it's a sub-optimal command-line experience. So I wrote Fakefile: a universal Makefile that you can save into any Node project to offer your npm scripts as Makefile targets. This makes operating npm scripts ten times faster, and offers a polite language agnostic way into your project to people coming from non-js backgrounds. Just type npm install fakefile and profit instantly.

]]></description>
      <pubDate>Thu, 18 Feb 2016 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/a-universal-makefile-for-javascript.html</guid>
      <content:encoded><![CDATA[> **TL;DR** The world is moving from Gulp and Grunt towards npm scripts. This seems like a place we could stay for a long time. But there's one shortcoming with npm scripts, especially when compared to Makefile: it's a sub-optimal command-line experience. So I wrote [Fakefile](https://www.npmjs.com/package/fakefile): a universal Makefile that you can save into any Node project to offer your npm scripts as Makefile targets. This makes operating npm scripts ten times faster, and offers a polite language agnostic way into your project to people coming from non-js backgrounds. Just type `npm install fakefile` and profit instantly.

With ES6, JSX, linting, bundling, asset building, testing and deploying, task automation has become an important part of any JavaScript project. It is, however, also a topic 
capable of causing headaches.

Just when you managed to codify all your projects' tasks into 
[Grunt](https://gruntjs.com/), you were tempted by 
[Gulp](https://gulpjs.com/). Then 
[Broccoli.js](https://broccolijs.com/) came along to seduce you and now you are also considering
[Mimosa](https://mimosa.io/). 

By picking one, you risk alienating contributors who picked another. It is a safe assumption that the investment of buying into Gulp or another JS task running ecosystem today, will vaporize in large part before the year is over, when the frontrunners of the community start settling on something shinier. Just like with the struggle of selecting the best bundler or framework, this can lead to [choice anxiety](https://www.youtube.com/watch?v=1bqMY82xzWo) and [JavaScript fatigue](https://medium.com/@ericclemmons/javascript-fatigue-48d4011b6fc4). 

I have invested in both Grunt and Gulp for a few projects and noticed that people coming from Gulp avoided touching Gruntfiles. I would assume Broccoli folks will tend to avoid dealing with Gulpfiles. 

Another thing I noticed happening was that complexity would creep up, resulting in these files quickly turning into over-engineered and untestable behemoths, overly susceptible to [bit rot](https://en.wikipedia.org/wiki/Software_rot).

I started to long for something simpler, something not so easily affected by the ever changing winds of JavaScript fashion. Before JavaScript task runners, I primarily used Makefiles for task automation and I considered revisiting that solution. 

## Longing for Makefile

Makefiles are primarily used for building software on unix, but they are also great task runners. They aren't tied to any particular language and offer developers coming from a wide variety of languages a *polite* way into your project. Makefile snippets can be shared across repositories in Bash, JavaScript, Ruby and so forth. Visit a project, type `make start`, `make build`, `make test` or `make deploy` and the expected thing will happen, no matter what language the project was written in.

[Autocompletion](https://davidalger.com/development/bash-completion-on-os-x-with-brew/) offered by Makefiles is instant. Press `<TAB>` and you'll quickly learn about the less obvious command-line features that the project offers. Discovering new projects and revisiting old ones becomes much more enjoyable when you don't have to brush up on the docs or rely on your meat-brain's fading memory. Autocomplete is like a torch, lighting your way as you're venturing deeper into the crypts of a project. A minimalistic form of documentation that is less likely to be outdated than the actual one and it's right there to assist you when you need it without context switching.

Makefiles do have their downsides though. For one, the learning curve is steep. Even the [finest of  tutorials](https://gist.github.com/isaacs/62a2d1825d04437c6f08) can take an entire rainy Sunday afternoon to digest. From recently starting a new open source project, [Uppy](https://uppy.io), I learned that especially front-enders, as well as developers coming from non-unix backgrounds, actually regard the choice for Makefiles to be *impolite*. They may not have spent that rainy Sunday afternoon. They may not even have `make` installed and it's a drag to get that to happen on Windows.

It made me wonder if I really wanted to introduce a barrier of entry like that.

## We tried npm scripts

We had been 
[reading about](https://substack.net/task_automation_with_npm_run) people successfully
[using npm scripts](https://medium.com/@housecor/why-i-left-gulp-and-grunt-for-npm-scripts-3d6853dd22b8#.io9vwkqhh) for task automation and decided to kick the tires on that. There turned out to be many advantages.

It is safe to assume that npm scripts will be around for as long as Node.js will be and that your npm scripts will be able to run on **any platform** that Node.js runs on.

It's convenient that if you type `eslint` in an npm script, npm will automatically check whether you have this package as a local dependency in `./node_modules` and run that before checking any global installs. Not having to require or type the paths to your modules keeps the tasks themselves **compact**.

Your tasks, as well as anything they rely on, are codified in a single `package.json`. This makes locking down dependencies as convenient as it is potent, increasing the likelihood of your tasks still working correctly years from now. More and more packages allow defining their configuration inside `package.json` as well, as opposed to inside their own `rc` files. This can further help to **reduce the number of moving parts**.

For bigger tasks, we can either write separate Node modules, or shell out to `./scripts`. I have found that carefully crafting your tasks and utilizing the **vast ecosystem** that is npm helps to avoid this becoming a common thing. Another advantage of plugging in directly to npm's ecosystem is that we don't have to wait on a `gulp-postcss` plugin to become available before we can try `postcss`.

For instance, the things I thought I would miss are compensated by Node modules like [nodemon](https://nodemon.io/) for file watching and [parallelshell](https://github.com/keithamus/parallelshell) for parallelizing tasks without blindly sending them to the background. 

Task runners should support interdependent tasks to avoid duplication. In npm you can have scripts call one another via a simple `npm run <the other script>`. To tweak their behavior you can use environment variables, preferably via [cross-env](https://www.npmjs.com/package/cross-env). For example:

```json
"build:plugins": "cross-env BUNDLE=plugins npm run minify",
"minify": "uglify --output=${BUNDLE:-app}.min.js ${BUNDLE:-app}.js",
```

The `:-` lets us specify a default value, so that if I run `npm run minify` on the command-line without setting any environment variables, `app` is chosen as the default bundle to minify. Running `npm run build:plugins` overrides this default and minifies the `plugins` bundle for me.
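
For readers less familiar with shell parameter expansion, the same fallback can be sketched in JavaScript. The `minifyPaths` helper below is hypothetical and only mirrors what `${BUNDLE:-app}` resolves to: the default is used when the variable is unset or empty.

```javascript
// Hypothetical helper mirroring the shell expansion ${BUNDLE:-app}:
// fall back to 'app' when BUNDLE is unset or an empty string.
function minifyPaths (env) {
  const bundle = env.BUNDLE || 'app'
  return { input: `${bundle}.js`, output: `${bundle}.min.js` }
}

console.log(minifyPaths({})) // { input: 'app.js', output: 'app.min.js' }
console.log(minifyPaths({ BUNDLE: 'plugins' })) // { input: 'plugins.js', output: 'plugins.min.js' }
```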

For a larger example, check out Google's [Addy Osmani](https://twitter.com/addyosmani)'s [Gist](https://gist.github.com/addyosmani/9f10c555e32a8d06ddb0) that demonstrates his real world task runner in npm scripts.

## The Good

I ported a few of our larger Makefiles and Gulpfiles over to npm scripts.

Compared to the other JS-based task runners I tried, npm scripts have many advantages. Not only are the tasks codified more tersely, but we can also profit from all of npm's ecosystem, and it seems a safe bet that our runner will be supported for as long as Node.js is.

Compared to Makefiles, I'm happy that we won't be scaring away any collaborators using Windows. Since it's all just JSON and JavaScript, we can expect people to start contributing to our tasks as well as our code, whereas a Makefile, in JS projects, would probably be left for the original author to patch up.

## The Bad

You miss out on a bit of power that Makefiles or JS task runners can offer, but that seems to be an acceptable trade-off so far.

Where npm scripts did fall short for me was in command-line operation. While I like that we're no longer alienating Windows or front-end developers, I also don't want to turn my back on unix developers slightly lower on JavaScript-fu, to many of whom `make test` will still be more intuitive than `npm run test`. They'll surely figure it out if they badly want to contribute, but I set out to lower barriers and optimize for developer happiness.

Autocomplete poses another command-line related grudge. npm scripts' autocomplete is very inefficient compared to Makefile's. If you are used to `mak<TAB>t<TAB>` to run the test suite, then spelling out `npm run test` seems to take forever. You can try tabbing your way through it, but the computer will cowardly throw false positives at you through `npm` and take ages to enumerate `run` and `test` for you, as it boots npm and all of its dependencies with every keystroke.

If you're often working with tasks and you know the computer could have understood you seconds ago, but you're still typing - that starts to become a little annoying.

Okay, that's an understatement. It feels like you're a butcher who can only use a spoon. I guess it ultimately gets the job done, but at the expense of reduced productivity and satisfaction.

## The (Slightly Ugly) Killer Solution

Introducing Fakefile: A Universal Makefile for JavaScript that proxies to your npm scripts.

Type `mak<TAB><TAB>` for autocompletion and Fakefile quickly enumerates any npm scripts in your `package.json` and presents these as ways into the project. It's instant. Autocomplete to `make test` and the command is passed onto `npm run test` for the lower-level plumbing. Any npm script you have will automatically be available under `make` at runtime, so it won't need any maintenance as your project changes its npm scripts.

Makefiles can't handle `:` characters well, so it will offer `npm run build:production` to you as `make build-production`.
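
The renaming can be sketched in JavaScript as follows. The helper names and the sample `scripts` object are made up for illustration, and the sketch assumes the reverse step simply swaps every `-` back to `:`.

```javascript
// Hypothetical helpers illustrating the renaming between npm script
// names and Makefile target names.
function toMakeTarget (scriptName) {
  return scriptName.replace(/:/g, '-')
}

function toNpmScript (target) {
  return target.replace(/-/g, ':')
}

// Sample scripts section, as it might appear in a package.json
const scripts = { 'build:production': 'webpack -p', test: 'mocha' }

console.log(Object.keys(scripts).map(toMakeTarget)) // [ 'build-production', 'test' ]
console.log(toNpmScript('build-production')) // build:production

// Caveat of a naive reverse mapping: a script name that already
// contains a dash does not survive the round trip.
console.log(toNpmScript(toMakeTarget('test-unit'))) // test:unit
```

Under that assumption, it is safest to stick to `:` as the only separator in npm script names.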

This gets us the best of both worlds. Codify your tasks in a system that won't be obsolete within the year, that is straightforward to people on Windows (they can ignore the Makefile and use `npm run`) and that welcomes unix folks to use their trusted fillet blade. In any repo I maintain, no matter the language, `make <something>` gets me what I want without thinking twice.

So how does the Fakefile look? This is the gist of it:

```bash
# Copyright (2016) by Kevin van Zonneveld https://twitter.com/kvz
# Licensed under MIT
define npm_script_targets
TARGETS := $(shell node -e 'for (var k in require("./package.json").scripts) {console.log(k.replace(/:/g, "-"));}')
$$(TARGETS):
	npm run $(subst -,:,$(MAKECMDGOALS))

.PHONY: $$(TARGETS)
endef

$(eval $(call npm_script_targets))
```

An up to date version can be found [here](https://github.com/kvz/fakefile/blob/master/Makefile). Save this snippet into your project root as a `Makefile` and that's it!

> Hm. Can't we make that any easier? And support upgrades and version pinning and all that jazz?

Sure, installing a **Fakefile** can also be done via npm:

```bash
npm install --save --exact fakefile
```

This will save a Makefile into your project root. 

If the installer detects a Makefile that it does not recognize by its SHA-1 hash, it will warn you instead of overwriting it. This gives you a chance to port any existing Makefile logic to npm scripts, after which you can safely remove your original Makefile and rerun the installation, this time successfully installing Fakefile. The installer is happy to overwrite known SHA-1s, so we can easily upgrade Fakefile, should the need arise.
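
The overwrite decision described above could look roughly like this. It is a sketch under stated assumptions, not Fakefile's actual installer code: `canOverwrite`, the sample Makefile contents and the list of known hashes are all hypothetical.

```javascript
const crypto = require('crypto')

function sha1 (contents) {
  return crypto.createHash('sha1').update(contents).digest('hex')
}

// Hypothetical decision function: overwrite only when there is no
// Makefile yet (null), or when the existing one matches a known release.
function canOverwrite (existingContents, knownSha1s) {
  if (existingContents === null) return true
  return knownSha1s.includes(sha1(existingContents))
}

// Stand-in for the contents of a previously released Fakefile
const oldRelease = '# an earlier Fakefile release\n'
const known = [sha1(oldRelease)]

console.log(canOverwrite(null, known)) // true
console.log(canOverwrite(oldRelease, known)) // true
console.log(canOverwrite('all:\n\tgcc main.c\n', known)) // false
```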

We're currently using npm scripts with a Fakefile in front of it for three different projects and we couldn't be happier with the results. Give it a try and let me know.
]]></content:encoded>
      <dc:date>2016-02-18T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Introducing Airbud</title>
      <link>https://kvz.io/introducing-airbud.html</link>
      <description><![CDATA[Retrieving stuff from the web is unreliable. Airbud adds retries for production, and fixture support for test.
]]></description>
      <pubDate>Mon, 21 Sep 2015 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/introducing-airbud.html</guid>
      <content:encoded><![CDATA[Retrieving stuff from the web is unreliable. Airbud adds retries for production, and fixture support for test.

Airbud is a wrapper around [request](https://www.npmjs.org/package/request) with support for handling JSON, retries with exponential backoff, and injecting fixtures. This will save you some boilerplate and make your applications easier to test.

## Install

Inside your project, type

```bash
npm install --save airbud
```

## Use

To use Airbud, first require it

In JavaScript

```javascript
var Airbud = require('airbud');
```

Or CoffeeScript:

```coffeescript
Airbud = require "airbud"
```

Airbud doesn't care.

### Example: simple

A common use case is getting remote JSON. By default `Airbud.json` will already:

  - Timeout each single operation in 30 seconds
  - [Retry 5 times over 10 minutes](https://www.wolframalpha.com/input/?i=Sum%5Bx%5Ek+*+5%2C+%7Bk%2C+0%2C+4%7D%5D+%3D+10+*+60+%26%26+x+%3E+0)
  - Return parsed JSON
  - Return `err` if
    - A non-2xx HTTP code is returned (3xx redirects are followed first)
    - The json could not be parsed

In CoffeeScript:

```coffeescript
Airbud.json "https://api.github.com/events", (err, events, meta) ->
  if err
    throw err
  console.log events[0].created_at
```

### Example: local JSON fixtures

Say you're writing an app that, among other things, retrieves public events from the GitHub API.

Using [environment variables](https://github.com/kvz/environmental), your production environment will have a `GITHUB_EVENTS_ENDPOINT` of `"https://api.github.com/events"`, but when you `source envs/test.sh`, it can be `"file://./fixtures/github-events.json"`.

Now just let `Airbud.retrieve` the `process.env.GITHUB_EVENTS_ENDPOINT`, and it will either retrieve the fixture, or the real thing, depending on which environment you are in.

This makes it easy to test the functions in your app that depend on this data, without having to worry about GitHub rate limiting, downtime, or slowness when running your tests. All of this without making your app aware, or changing its flow. In JavaScript:

```javascript
var opts   = {
  url: process.env.GITHUB_EVENTS_ENDPOINT,
};

Airbud.json(opts, function (err, events, meta) {
  if (err) {
    throw err;
  }

  console.log('Number of attempts: '+ meta.attempts);
  console.log('Time it took to complete all attempts: ' + meta.totalDuration);
  console.log('Some auto-parsed JSON: ' + events[0].created_at);
});
```
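
To make the fixture idea more concrete, here is a minimal sketch of how a URL scheme could be used to choose between a local file and the network. This is not Airbud's internal code: `resolveSource` is a hypothetical helper that only shows the decision, not the actual reading or requesting.

```javascript
// Hypothetical helper: decide where to read from, based on the URL scheme.
function resolveSource (url) {
  const prefix = 'file://'
  if (url.startsWith(prefix)) {
    // Everything after the scheme is treated as a path on disk
    return { type: 'fixture', path: url.slice(prefix.length) }
  }
  return { type: 'remote', url: url }
}

console.log(resolveSource('file://./fixtures/github-events.json'))
// { type: 'fixture', path: './fixtures/github-events.json' }
console.log(resolveSource('https://api.github.com/events'))
// { type: 'remote', url: 'https://api.github.com/events' }
```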

### Example: customize

You don't have to use environment vars or the local fixture feature. You can also use Airbud as a wrapper around request to profit from retries with exponential backoffs. Here's how to customize the retry flow in CoffeeScript:

```coffeescript
opts =
  retries         : 3
  randomize       : true
  factor          : 3
  minInterval     : 3  * 1000
  maxInterval     : 30 * 1000
  operationTimeout: 10 * 1000
  expectedStatus  : /^[2345]\d{2}$/
  expectedKey     : "status"
  url             : "https://api.github.com/events"

Airbud.retrieve opts, (err, events, meta) ->
  if err
    throw err

  console.log events
```


### Example: 3 retries in one minute, retry after 3s timeout for each operation

```coffeescript
opts =
  url             : "https://api2.transloadit.com/instances"
  retries         : 2
  factor          : 1.73414
  expectedKey     : "instances"
  operationTimeout: 3000
```

Some other tricks up Airbud's sleeves are `expectedKey` and `expectedStatus`, to make it error out when you get invalid data, without you writing all the extra `if` and maybes.


## Options

Here are all of Airbud's options and their default values.

```coffeescript
# Timeout of a single operation
operationTimeout: 30000

# Retry 5 times over 10 minutes
# https://www.wolframalpha.com/input/?i=Sum%5Bx%5Ek+*+5%2C+%7Bk%2C+0%2C+4%7D%5D+%3D+10+*+60+%26%26+x+%3E+0
# The maximum amount of times to retry the operation
retries: 4

# The exponential factor to use
factor: 2.99294

# The number of milliseconds before starting the first retry
minInterval: 5 * 1000

# The maximum number of milliseconds between two retries
maxInterval: Infinity

# Randomizes the intervals by multiplying with a factor between 1 to 2
randomize: true

# Automatically parse json
parseJson: null

# A key to find in the rootlevel of the parsed json.
# If not found, Airbud will error out
expectedKey: null

# An array of allowed HTTP Status codes. If specified,
# Airbud will error out if the actual status doesn't match.
# 30x redirect codes are followed automatically.
expectedStatus: "20x"

# Custom headers to submit in the request
headers: []
```
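
To see where the 10 minutes come from, the backoff arithmetic can be sketched as below. This assumes the interval before attempt `k` is `minInterval * factor^k` and uses five intervals, following the linked Wolfram|Alpha formula (k = 0..4). How Airbud maps `retries` onto these intervals is not shown here, the `randomize` option is left out, and `backoffIntervals` is a hypothetical helper.

```javascript
// Hypothetical helper: exponential backoff intervals in milliseconds.
// Interval k (0-based) is minInterval * factor^k, capped at maxInterval.
function backoffIntervals ({ count, factor, minInterval, maxInterval = Infinity }) {
  const result = []
  for (let k = 0; k < count; k++) {
    result.push(Math.min(minInterval * Math.pow(factor, k), maxInterval))
  }
  return result
}

const intervals = backoffIntervals({ count: 5, factor: 2.99294, minInterval: 5 * 1000 })
const totalMs = intervals.reduce((sum, ms) => sum + ms, 0)

console.log(intervals.map(ms => Math.round(ms / 1000))) // seconds per interval
console.log(Math.round(totalMs / 1000)) // 600, i.e. 10 minutes
```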

## Meta

Besides `err` and `data`, Airbud passes a third argument, `meta`, to your callback. It contains some metadata about the operation(s) for your convenience.

```coffeescript
# The HTTP status code returned
statusCode
# An array of all errors that occurred
errors
# Number of attempts before Airbud was able to retrieve, or gave up
attempts
# Total duration of all attempts
totalDuration
# Average duration of a single attempt
operationDuration
```
]]></content:encoded>
      <dc:date>2015-09-21T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Watch Your Language (Automatically)</title>
      <link>https://kvz.io/watch-your-language.html</link>
      <description><![CDATA[I was writing internal documentation on how I set up automated language checking at Transloadit. Halfway through, I thought this could be useful to the rest of the world :earth_americas: as well, so I rewrote it in a more generic fashion. I'll attempt to first give a high-level overview of the problem, then I will drive all the way down to the low-level nuts &amp; bolts of solving it. I hope you'll enjoy, here goes!
]]></description>
      <pubDate>Wed, 16 Sep 2015 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/watch-your-language.html</guid>
      <content:encoded><![CDATA[I was writing internal documentation on how I set up automated language checking at Transloadit. Halfway through, I thought this could be useful to the rest of the world :earth_americas: as well, so I rewrote it in a more generic fashion. I'll attempt to first give a high-level overview of the problem, then I will drive all the way down to the low-level nuts & bolts of solving it. I hope you'll enjoy, here goes!

At Transloadit we've been extracting all significantly sized chunks of text (documentation, blog posts, static pages) into a separate content repository.

Up until this migration, our text was scattered across MySQL tables, templates, and
HTML files. A big soup of content, layout, code, locations. Developers were able to access it, but without much joy. Non-developers didn't stand a chance.

We thought it would be interesting to see if we could attract technical writers and give them
full access to our content. We pictured they could use the GitHub web interface in wiki-like fashion so they could improve our language without being distracted by code, accidentally changing it, or needing much skill in that area.

This hasn't reached its full potential yet, but:

- as developers we're already enjoying working on the (purely Markdown) content
- having all content in a separate repository opens doors to other cool possibilities, like automated quality control, or: Continuous Integration

## Continuous Integration

Continuous Integration is a concept normally associated with code. Here's how [ThoughtWorks](https://www.thoughtworks.com/continuous-integration) explained it:

> Continuous Integration (CI) is a development practice that requires developers to integrate code into a shared repository several times a day. Each check-in is then verified by an automated build, allowing teams to detect problems early.

At Transloadit we're already using this for all our code. But could we also use this for our English?

This question was extra relevant for us, because while the majority
of our customers live in the United States,
Transloadit is Berlin-based, and nobody on our current [team](/about#team)
is a native English speaker.

Considering how damaging language errors can be when people are still in the early stages of evaluating a product, it is all the more important for us to have some extra checks in place.

I'm attacking poor content quality in three areas:

1. [Inconsiderate Writing](#inconsiderate-writing)
1. [Messy Formatting](#messy-formatting)
1. [Spelling Errors](#spelling-errors)

## Inconsiderate Writing

To quote [npm weekly](https://blog.npmjs.org/post/128823264455/npm-weekly-28-how-to-do-many-things),

> Odds are none of us intends to exclude or hurt fellow members of the community, but polarizing and gender-favoring language has a way of slipping into what we write. Sometimes it’s a big help to have a second set of eyes that can look things over, notice what we’ve overlooked, and nudge us towards being more considerate and inclusive.
> Alex helps "catch insensitive, inconsiderate writing" by identifying possibly offensive language and suggesting helpful alternatives.

To install [alex](https://alexjs.com/) we run a simple `npm install --save alex`. According to its [author](https://github.com/wooorm/alex/issues/33),

> Alex isn’t as smart as a human, but it tries its best and is sometimes overly happy to let you know something may be insensitive.

This means there may occasionally be false positives and we don't want
alex's warnings to be fatal, so we're using

```bash
node_modules/.bin/alex || true
```

We'll get to see language that alex thinks could be improved, but we won't make those
suggestions critical.

For example

```bash
  74:74-74:76    warning  `he` may be insensitive, use `they`, `it` instead
```

We're currently rewriting our docs to be more inclusive thanks to this project.

## Messy Formatting

Our text files are in the Markdown format. This format was chosen, because

- it's fairly easy to digest for humans and computers,
- has a great ecosystem of tools around it, and
- offers a good separation between a document's structure, and its layout. We can specify that something is emphasized, but not Comic Sans. Those decisions are left to the designers.

Often there are multiple ways to achieve the same goal in Markdown. As with code,
it helps to settle on a convention, and force every contributor to follow that. By taking away some of this (useless) artistic freedom, the resulting document looks well maintained, and invites further contribution.

For this we're using [mdast](https://github.com/wooorm/mdast#readme) with a lint plugin: `npm install --save mdast mdast-lint`

Since we don't want to check external projects (like mdast itself) or re-check built artifacts, we're excluding a few locations

```bash
echo '_site/' >> .mdastignore
echo 'node_modules/' >> .mdastignore
```

We then saved the following convention in `.mdastrc`, but this is of course dependent on your [settings](https://github.com/wooorm/mdast#mdastprocessvalue-options-done) and [lint preferences](https://github.com/wooorm/mdast-lint/blob/master/doc/rules.md)

```javascript
{
  "plugins": {
    "lint": {
        "blockquote-indentation": 2,
        "emphasis-marker": "*",
        "first-heading-level": false,
        "link-title-style": "\"",
        "list-item-indent": false,
        "list-item-spacing": false,
        "no-shell-dollars": false,
        "maximum-heading-length": false,
        "maximum-line-length": false,
        "no-duplicate-headings": false,
        "no-blockquote-without-caret": false,
        "no-file-name-irregular-characters": true,
        "no-file-name-outer-dashes": false,
        "no-heading-punctuation": false,
        "no-html": false,
        "no-multiple-toplevel-headings": false,
        "ordered-list-marker-style": ".",
        "ordered-list-marker-value": "one",
        "strong-marker": "*"
    }
  },
  "settings": {
    "gfm": true,
    "yaml": true,
    "rule": "-",
    "ruleSpaces": false,
    "ruleRepetition": 70,
    "emphasis": "*",
    "listItemIndent": "1",
    "incrementListMarker": false,
    "spacedTable": false
  }
}
```

Then we lint for the first time

```bash
node_modules/.bin/mdast --frail .
```

This may return

```bash
_posts/2015-09-15-spelling.md
  246:1      warning  Use spaces instead of hard-tabs         no-tabs
```

As a bonus, mdast can even attempt to repair this automatically

```bash
node_modules/.bin/mdast --output .
```

We were impressed by how much mdast was able to fix. Make sure your files are committed to Git before running this command, though. You'll want to review the changes made, and revert them if needed. You'll need a few iterations to get this to a good place.

## Spelling Errors

William Dutton, director of the Oxford Internet Institute at Oxford University, says in [Spelling mistakes 'cost millions' in lost online sales](https://www.bbc.com/news/education-14130854) that in some informal parts of the internet, such as Facebook, there is greater tolerance towards spelling and grammar.

> However, there are other aspects, such as a home page or commercial offering that are not among friends and which raise concerns over trust and credibility.
> In these instances, a misspelt word could be a killer issue.

You had me at 'concerns'. Let's get to work. For checking spelling in Markdown documents we're using `npm install --save markdown-spellcheck`.

It may not catch grammar and many other subtleties ("it's" vs "its"), but at least many of my unfortunate stubborn mistakes like these get caught before reaching production:

- my own fantasy English ("symbiose" vs "symbiosis")
- stubborn misfires ("editted" vs "edited"), and
- mixing British with US English ("summarise" vs "summarize")

(in my defense: I'm not a native English speaker :smile:)

What's cool is that markdown-spellcheck will automatically skip code blocks and other Markdowny things - but obviously we still had to ignore things like `Transloadit` & `FFmpeg`

```bash
echo 'Transloadit' >> .spelling
echo 'FFmpeg' >> .spelling
```

We're now ready to check our Markdown files for spelling mistakes

```bash
node_modules/.bin/mdspell \
  --report \
  --en-us \
  --ignore-numbers \
  --ignore-acronyms \
  **/*.md \
  _layouts/*.html \
  _includes/*.html \
  *.html
```

This might return that "editted" is not a word.

### First Run

There's a good chance the first run uncovers many issues, both with your
documents and with the dictionary. It's a good idea to run `mdspell` without the `--report` flag so it will enter the default interactive mode.

<div class="kodak-container kodak-dropshadow">
 <img src="/assets/images/posts/2015-09-16-watch-your-language-0.png">
</div>

This will allow you to exclude certain files and build a personalized dictionary inside `.spelling`. This takes a while, which makes it a good job for when you want to be productive on an otherwise uninspired afternoon.

As you add new content you'll sometimes have to add words to the whitelist as well. But at least you'll know that all cases where words stray from the dictionary are deliberate. And that's a good feeling.

## Combine

Now let's put this all together.

Since these are all tools we installed from npm it might make sense to use
[npm run scripts](https://blog.npmjs.org/post/118810260230/building-a-simple-command-line-tool-with-npm), but in our case I chose a [Makefile](https://gist.github.com/isaacs/62a2d1825d04437c6f08) simply because we like TABing through shell autocompletion and so that we can have the same developer entry point in all of our projects, whether written in Node.js, Bash, or Go.

```bash
SHELL       := /usr/bin/env bash
tstArgs     :=
tstPattern  := **/*.md _layouts/*.html _includes/*.html *.html

# Note: recipe lines must be indented with a tab character, not spaces.
.PHONY: fix-markdown
fix-markdown:
	@echo "--> Fixing Messy Formatting.."
	@node_modules/.bin/mdast --output .

.PHONY: test-inconsiderate
test-inconsiderate:
	@echo "--> Searching for Inconsiderate Writing (non-fatal).."
	@node_modules/.bin/alex $(tstArgs) || true

.PHONY: test-spelling
test-spelling:
	@$(MAKE) test-spelling-interactive tstArgs=--report

.PHONY: test-spelling-interactive
test-spelling-interactive:
	@echo "--> Searching for Spelling Errors.."
	@node_modules/.bin/mdspell \
	  $(tstArgs) \
	  --en-us \
	  --ignore-numbers \
	  --ignore-acronyms \
	  $(tstPattern)

.PHONY: test-markdown-lint
test-markdown-lint:
	@echo "--> Searching for Messy Formatting.."
	@node_modules/.bin/mdast --frail $(tstArgs) .

.PHONY: test
test: test-inconsiderate test-spelling test-markdown-lint
	@echo "All okay : )"
```

Now we can run `make test` to see if all our checks pass.

We can run `make test-spelling` to only zoom in on spelling mistakes, or
`make test-spelling-interactive` if we want to enter interactive mode after
writing content with a lot of new words unlikely to be in the dictionary already.

If you have [Bash Completion](https://blog.jeffterrace.com/2012/09/bash-completion-for-mac-os-x.html), just type `make`, press TAB, and see all the available shortcuts.

## Automate

To automate testing, we'll require a Continuous Integration server.

Travis CI, Strider, and Drone.io all fit the bill, as long as we have a central place that will execute code in a reliable and repeatable fashion whenever a change to the repository is made.

We're using [Jenkins](https://jenkins-ci.org/) for private projects, and I created 3 new chained jobs for our content repository:

- `content-build` turns our Markdown into static HTML content via Jekyll, then triggers:
- `content-test` runs all the commands in this post, then triggers:
- `content-inject` stores the HTML into our website, then triggers: `website-build`, `website-test`, `website-deploy`. A chain we had already set up to deploy our website.

<div class="kodak-container kodak-dropshadow">
 <img src="/assets/images/posts/2015-09-16-watch-your-language-1.png">
</div>

So now new content can only be injected and deployed if all checks pass.
It's a pretty long chain but luckily a machine takes care of that :smile:

And when that machine detects typos in new content, we have a Slack integration set up so we get notified immediately.

<div class="kodak-container kodak-dropshadow">
 <img src="https://transloadit.com/assets/images/blog/2015-09-slack.png">
</div>

## So Is This Perfect Now?

No. Humans are fallible and so are their machines and dictionaries.

I'll need to keep tweaking `.spelling`, and "it" needs to keep correcting me. But via this automated quality control for language, we keep each other in check, and have fewer errors than before.

In the case of Transloadit, we were able to fix 151 mistakes in our first run:

<div class="kodak-container kodak-dropshadow">
 <img src="https://transloadit.com/assets/images/blog/2015-09-github.png">
</div>

Yes.. it turns out we <s>are</s> were really bad spellers!
]]></content:encoded>
      <dc:date>2015-09-16T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>tus 1.0-prerelease</title>
      <link>https://kvz.io/tus-1.0-prerelease.html</link>
      <description><![CDATA[It's been a while since I've mentioned tus, but there have been some cool developments so we'd like to refresh your memory.
]]></description>
      <pubDate>Tue, 10 Feb 2015 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/tus-1.0-prerelease.html</guid>
      <content:encoded><![CDATA[It's been a while since I've mentioned [tus](https://tus.io), but there have been some cool developments so we'd like to refresh your memory.

## So Yeah, Uploading is Broken

At Transloadit we care about providing a reliable upload service. Sometimes however, it could happen that our servers misbehave, that a mobile user switches connections, or that your end user has flaky WiFi.

We're doing what we can to limit cases like these, but also have to face the brisk reality that sometimes, an upload needs to be retried from the start.

A bad user experience, especially if in the middle of a 2 GB upload on a poor connection. What's worse, the longer an upload takes, the longer it is exposed to this poor connection.

With media getting bigger and networks staying fragile, we need a better way to handle these outages and hiccups.

## Resumable Uploads

Any decent network library implements retries, but ideally, a retry picks up where the user left off, and starts uploading only the remaining bytes when the end user for instance clears a tunnel.

If that happens under the hood, the user might not even notice they had an interrupted connection, as the total upload time is barely impacted.

What's more, if we could redesign this in such a way that we could send multiple file parts simultaneously, the upload might even become faster.

For all of this to work, we need resumable uploads.

There are many pieces of software / implementations that already offer resumable uploads, but they're not as thorough or interoperable as we'd like. They all speak a different dialect.

Imagine every web browser having its own incompatible HTTP dialect. It's all just not very *open web* like.

> A thousand incompatible one week projects that barely work, when all we need is one real project, done right.

And that's what tus is.

## tus

tus is an open protocol that's been [in development](https://github.com/tus/tus-resumable-upload-protocol) for over a year, with high-level contributions from all over the world and from different companies besides Transloadit - such as [Vimeo](https://github.com/tus/tus-resumable-upload-protocol/issues/33#issuecomment-46794724).

Here's what a simple back & forth between a client and server that can speak tus 1.0 would look like:

```bash
# Client:
> POST /files HTTP/1.1
> Host: tus.example.org
> Content-Length: 0
> Entity-Length: 100
> TUS-Resumable: 1.0.0
> Metadata: filename d29ybGRfZG9taW5hdGlvbl9wbGFuLnBkZg==

# Server:
< HTTP/1.1 201 Created
< Location: https://tus.example.org/files/24e533e02ec3bc40c387f1a0e460e216
< TUS-Resumable: 1.0.0

# Client:
> PATCH /files/24e533e02ec3bc40c387f1a0e460e216 HTTP/1.1
> Host: tus.example.org
> Content-Type: application/offset+octet-stream
> Content-Length: 30
> Offset: 0
> TUS-Resumable: 1.0.0
>
> [first 30 bytes]

# Server:
< HTTP/1.1 204 No Content
< TUS-Resumable: 1.0.0
```
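
To make the resume step concrete, here is a minimal sketch of how a client could describe its next `PATCH` request once it knows how many bytes the server already has. The header names are copied from the exchange above; the `buildResumeRequest` helper and its parameters are hypothetical, and how the client learns the current offset is not shown in that exchange, so it is passed in as an argument.

```javascript
// Hypothetical helper: describes the PATCH request a client would send to
// continue an upload. Header names are taken from the exchange above.
function buildResumeRequest(uploadUrl, totalLength, offset, chunkSize) {
  if (offset < 0 || offset > totalLength) {
    throw new RangeError('offset must be between 0 and totalLength');
  }
  // Never send more than what is left of the file
  const length = Math.min(chunkSize, totalLength - offset);
  return {
    method: 'PATCH',
    url: uploadUrl,
    headers: {
      'Content-Type': 'application/offset+octet-stream',
      'Content-Length': String(length),
      'Offset': String(offset),
      'TUS-Resumable': '1.0.0',
    },
    // Byte range [start, end) of the file that goes into the request body
    range: { start: offset, end: offset + length },
  };
}

// The first 30 of 100 bytes arrived, so continue with the remaining 70
const next = buildResumeRequest(
  'https://tus.example.org/files/24e533e02ec3bc40c387f1a0e460e216',
  100,
  30,
  1024
);
console.log(next.headers.Offset, next.headers['Content-Length']);
```

Because only the byte range after the offset is sent, an interrupted 2 GB upload costs at most one chunk of rework instead of a restart from zero.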

Besides a lightweight but well documented protocol, tus also provides two example implementations:

- [Client for jQuery](https://github.com/tus/tus-jquery-client/)
- [Server in Go](https://github.com/tus/tusd/tree/neXT)

Additionally there are many community provided [implementations](https://tus.io/implementations.html) for platforms and languages like
[iOS](https://github.com/tus/TUSKit),
[Android](https://github.com/tus/tus-android-client),
[Ruby](https://github.com/picocandy/rubytus),
[Node.js](https://github.com/vayam/brewtus),
[Python](https://github.com/vayam/tuspy),
[PHP](https://github.com/leblanc-simon/php-tus),
etc. Since we're building on top of *plain HTTP*, it's straightforward to build implementations in your favorite language. If you fully implement the protocol and license it MIT, we might adopt your project.

## Nearing 1.0 - Last Call for Feedback

The news today is that we're really [close to releasing](https://tus.io/blog/2015/02/03/protocol-v1.0.0-prerelease/) a 1.0 stable version of the protocol and our reference implementations, and are calling for one last round of feedback over at our [1.0 pull request](https://github.com/tus/tus-resumable-upload-protocol/pull/57) on GitHub.

Please don't be shy and leave your comments and recommendations if you still want to impact tus 1.0.

Finally, we encourage everybody who deals with uploads (yes, competitors too) to have a look at tus and consider using it for your next release. tus is Transloadit started and funded, but community owned, and this will not change.

At [Transloadit](https://transloadit.com) we'll be looking to implement tus in our upload & encoding service this year.
]]></content:encoded>
      <dc:date>2015-02-10T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Introducing Ratestate</title>
      <link>https://kvz.io/introducing-ratestate.html</link>
      <description><![CDATA[Ratestate is a ratelimiter in the form of a Node.js module that can transmit states of different entities while avoiding transmitting the same state twice, and adhering to a global speed limit.
]]></description>
      <pubDate>Sun, 21 Dec 2014 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/introducing-ratestate.html</guid>
      <content:encoded><![CDATA[Ratestate is a ratelimiter in the form of a [Node.js module](https://npmjs.org/package/ratestate) that can transmit states of different entities while avoiding transmitting the same state twice, and adhering to a global speed limit.

Let's say you purchased some intelligent lightbulbs and want to set new colors in near-realtime (e.g. based on color detection of camera input), however the central hub receiving the color commands has a rate limiter that only accepts 30 updates per second. Ratestate can help you spread & drip updates amongst the different lightbulbs, without forming queues (by forgetting about superseded colors).

## Install

```bash
npm install --save ratestate
```

## Use

Here's a little CoffeeScript example

```coffeescript
ratestate = new Ratestate
  interval: 30
  worker  : (id, state, cb) ->
    # Transmit the state to id
    cb null

ratestate.start()
ratestate.setState 1, color: "purple"
ratestate.setState 1, color: "green"
ratestate.setState 1, color: "yellow"
ratestate.setState 1, color: "yellow"
ratestate.setState 1, color: "yellow"
ratestate.setState 1, color: "green"
ratestate.stop()
```

In this example, entity `1` will reach `"green"` and probably won't be set to any other intermediate state (color in this case), as we're setting the state much faster than our configured `interval` could keep up with.

## Behavior and Limitations

Ratestate is similar to Underscore's [debounce](https://underscorejs.org/#debounce), but it runs indefinitely and assumes you want to update the state of different entities while being globally speed limited across all of them. For instance you might want to

 - Continuously update 20 different `.json` files on S3, but your server/network only allows a few updates per second. The part of the program that sets the updates should fire & forget, and not concern itself with environmental constraints like that.
 - Flush the current status of visitors to disk for caching, but throttle the total throughput as to not wear out your harddisk or cause high load.
 - Capture dominant colors from a video feed at 60 frames per second, and push those colors to Philips HUE lamps, but the combined throughput to them is capped by a rate-limiter on the central Bridge, allowing you to only pass through 30 colors per second total.

You can call `setState` as much as you'd like, and Ratestate will

 - Transmit at most once every configured `interval` ms
 - Take care of an even spread between the entities
 - Not execute `worker` if the state has not changed
 - Consider the last pushed state for an entity leading: it will **not** attempt to transmit **every** state if more states are set than can be transmitted
 - Avoid concurrently working on the same entity (last write wins)
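
The behavior listed above can be sketched in a few lines of plain JavaScript. This is an illustration of the idea only, not Ratestate's actual implementation: the `LatestStateBuffer` class and its `tick` method are hypothetical names, and a real version would call `tick` from a timer every `interval` ms.

```javascript
// Illustration only: not Ratestate's actual implementation.
class LatestStateBuffer {
  constructor() {
    this.pending = new Map(); // id -> most recently set state
    this.sent = new Map();    // id -> JSON of the state last given to the worker
  }

  // Overwrites any state that was still waiting, so no queue builds up
  setState(id, state) {
    this.pending.set(id, state);
  }

  // Called once per interval: hands at most one changed state to the worker.
  // Map iteration follows insertion order, so the entity that has been
  // waiting longest is served first.
  tick(worker) {
    for (const [id, state] of this.pending) {
      this.pending.delete(id);
      const serialized = JSON.stringify(state);
      if (this.sent.get(id) === serialized) continue; // unchanged, skip it
      this.sent.set(id, serialized);
      worker(id, state);
      return true;
    }
    return false;
  }
}

const buffer = new LatestStateBuffer();
const transmitted = [];
const worker = (id, state) => transmitted.push(`${id}:${state.color}`);

buffer.setState(1, { color: 'purple' });
buffer.setState(1, { color: 'green' }); // supersedes purple
buffer.setState(2, { color: 'red' });
buffer.tick(worker);                    // transmits entity 1 (green)
buffer.tick(worker);                    // transmits entity 2 (red)
buffer.setState(1, { color: 'green' }); // same as what was last transmitted
buffer.tick(worker);                    // nothing to do
console.log(transmitted);
```

Purple is never transmitted: it was replaced before the first tick came around, which is the "forgetting about superseded states" part.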

## Hashing

By default, Ratestate detects if a state has changed by comparing hashes of set `state` objects and it won't consider executing the `worker` on entity states that have not changed.

If this built-in serializing & hashing is too heavy for your usecase (your states are huge - your interval low), you can supply your own function that will be executed on the `state` object to determine its uniqueness. In the following example we'll supply our own `hashFunc` to determine if the state is a candidate for passing to the `worker`.

```coffeescript
megabyte = 1024 * 1024
status   =
  id            : "foo-id"
  status        : "UPLOADING"
  bytes_received: 2073741824
  client_agent  : "Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0"
  client_ip     : "123.123.123.123"
  uploads       : [
    name: "tesla.jpg"
  ]
  results: [
    original:
      name: "tesla.jpg"
  ,
    resized:
      name: "tesla-100px.jpg"
  ]

ratestate = new Ratestate
  hashFunc: (state) ->
    return [
      state.status
      state.bytes_received - (state.bytes_received % megabyte)
      state.uploads.length
      state.results.length
    ].join "-"

ratestate.start()
ratestate.setState "foo-id", status
ratestate.stop()
```

This would internally be 'hashed' as a string of the form `UPLOADING-<bytes_received rounded down to a whole megabyte>-1-2`. If we detect a change in our system and blindly call `setState` for our entity, this only executes the `worker` on it if

 - The `status` has changed, OR
 - We have a new megabyte's worth of `bytes_received`, OR
 - The amount of `uploads` changed, OR
 - The amount of `results` changed

As that covers all the interesting changes for us, it's more efficient than serializing and hashing an entire object.
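
For readers who prefer plain JavaScript, here is the same idea as a standalone sketch, taking a megabyte as 1024 * 1024 bytes. The sample states are made up for illustration; the point is that two states within the same megabyte produce the same hash, so the second one would not reach the `worker`.

```javascript
const MEGABYTE = 1024 * 1024;

// Same fields as the CoffeeScript hashFunc above
function hashFunc(state) {
  return [
    state.status,
    // Round down to a whole megabyte so small progress updates hash the same
    state.bytes_received - (state.bytes_received % MEGABYTE),
    state.uploads.length,
    state.results.length,
  ].join('-');
}

// Made-up sample states
const base = {
  status: 'UPLOADING',
  bytes_received: 5 * MEGABYTE,
  uploads: [{ name: 'tesla.jpg' }],
  results: [],
};
const a = hashFunc(base);
const b = hashFunc({ ...base, bytes_received: 5 * MEGABYTE + 1000 });
const c = hashFunc({ ...base, bytes_received: 6 * MEGABYTE });

console.log(a === b); // true: still within the same megabyte
console.log(a === c); // false: a new megabyte was reached
```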

## finalState

`finalState` is much like `setState` (which it calls under the hood), but requires a callback, which is called after the `worker` successfully finished on it. Additionally, all data of the involved entity are removed from Ratestate.
]]></content:encoded>
      <dc:date>2014-12-21T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Introducing Environmental</title>
      <link>https://kvz.io/introducing-environmental.html</link>
      <description><![CDATA[Some people feel that shipping .json / .yml / .xml config files is an upgrade over using archaic environment variables.
]]></description>
      <pubDate>Sat, 17 May 2014 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/introducing-environmental.html</guid>
      <content:encoded><![CDATA[Some people feel that shipping `.json` / `.yml` / `.xml` config files is an upgrade over using archaic environment variables.

![687474703a2f2f6769666174726f6e2e636f6d2f77702d636f6e74656e742f75706c6f6164732f323031332f30322f6974735f615f747261702e676966](/assets/images/posts/2014-05-17-introducing-environmental-0.gif)

Don't let your app load its config, inject it instead.

Unix environment vars are ideal for configuration and I have yet to encounter an application that wouldn't be better off with them. Why?

- You can override a value at near-runtime without having to change/backup config files: `DEBUG=*.* node run.js`
- You can inject environment variables (passwords, API keys) into the memory of a process belonging to a non-privileged user: `source envs/production.sh && sudo -EHu www-data node run.js` without having to run / write any software for it.
- You can inherit. Inside `staging.sh`, just `source production.sh`, inside `kevin.sh` `source development.sh`
- Your operating system is aware and provides tools to inspect, debug, optionally pass on to other processes, etc.
- You can directly use config across languages, e.g. in supporting BASH scripts
- You can directly use the config in a terminal yourself, e.g. `cd ${MYAPP_DIR}`

And as with any other type of config:

- You can group/save them into files and keep them out of version control

One downside of environment variables is that there is little convention and syntactic sugar in the high-level languages. It doesn't feel atomic and you think it's more likely to let you down. This module attempts to change that.

Environmental doesn't:

 - Break [12-factor](https://12factor.net/)
 - Get in your way

Environmental does:

 - Impose **one way** of dealing with environment variables
 - Make vars available in nested format inside your app (e.g. `MYAPP_REDIS_HOST` becomes `config.redis.host`)
 - <3 unix
 - Interpret multiple inherited bash environment files in an isolated environment to capture them, and prepare them for exporting to [Nodejitsu](https://www.nodejitsu.com/documentation/jitsu/env/) or [Heroku](https://devcenter.heroku.com/articles/config-vars).

## Conventions

### Layout

Environmental tree:

```bash
_default.sh
├── development.sh
│   └── test.sh
└── production.sh
    └── staging.sh
```

On disk:

```bash
envs/
├── _default.sh
├── development.sh
├── production.sh
├── staging.sh
└── test.sh
```

You could make this super-[DRY](https://en.wikipedia.org/wiki/Don't_repeat_yourself), but I actually recommend using mainly
`development.sh` and `production.sh`, and duplicate keys between them
so you can easily compare side by side.
Then just use `_default.sh`, `test.sh`, `staging.sh` for tweaks, to keep things
clear. Read 'Inheritance can be a bitch' to see why.

### Inheritance can be a bitch

One common pitfall is re-use of variables:

```bash
export MYSQL_HOST="127.0.0.1"
export MYSQL_URL="mysql://user:pass@${MYSQL_HOST}/dbname"
```

Then when you extend this and only override `MYSQL_HOST`, obviously the `MYSQL_URL` will remain unaware of your host change. Ergo: duplication of vars might be the lesser evil here compared to going out of your way to DRY things up.

### Inject features

Instead of having your code make decisions based on environment:

```coffeescript
if process.env.NODE_ENV == "production"
  # Install cronjobs
```

Keep that responsibility with your environment files:

```bash
$ cat envs/_default.sh
TLS_CRONJOBS_INSTALL="0"

$ cat envs/production.sh
TLS_CRONJOBS_INSTALL="1"
```

```coffeescript
if config.cronjobs.install == "1"
  # Install cronjobs
```

### Mandatory and unprefixed variables

These variables are mandatory and have special meaning. There is no syntactic sugar for them, you are to access them via `process.env.<var>`:

```bash
export NODE_APP_PREFIX="MYAPP" # filter and nest vars starting with MYAPP right into your app
export NODE_ENV="production"   # the environment your program thinks it's running
export DEPLOY_ENV="staging"    # the machine you are actually running on
export DEBUG=*.*               # used to control debug levels per module
```

After getting that out of the way, feel free to start hacking, prefixing all
other vars with `MYAPP` - or the actual short abbreviation of your app name. Don't use an underscore `_` in this name.

In this example, `TLS` is our app name:

```bash
export NODE_APP_PREFIX="TLS"
export TLS_REDIS_HOST="127.0.0.1"
export TLS_REDIS_USER="jane"
```

## Getting started

In a new project, type

```bash
$ npm install --save environmental
```

This will install the node module. Next you'll want to set up an example environment as shown in layout, using these templates:

```bash
cp -Ra node_modules/environmental/envs ./envs
```

Add `envs/*.sh` to your project's `.gitignore` file so they are not accidentally committed into your repository.  
Having env files in Git can be convenient while you're still prototyping, but once you go live you'll want to change all credentials and sync your env files separately from your code.

## Accessing config inside your app

Start your app in any of these ways:

```bash
source envs/development.sh
node myapp.js
```

```bash
source envs/production.sh
DEBUG=*.* node myapp.js
```

```bash
source envs/staging.sh
# Following seems weird, but sudo will not preserve $PATH, regardless of -E
sudo -EHu www-data env PATH=${PATH} node myapp.js
```

```bash
source envs/development.sh && node myapp.js
```

```bash
start myapp # see upstart example below
```

Inside your app you can now obviously already just access `process.env.MYAPP_REDIS_HOST`, but **Environmental** also provides some syntactic sugar so you could type `config.redis.host` instead. Here's how:

```javascript
var config = require('environmental').config();
console.log(config);

// This will return
//
//   { redis: { host: '127.0.0.1' } }
```

Or in coffeescript if that's your cup of tea:

```coffeescript
config      = require("environmental").config()
redisClient = redis.createClient(config.redis.port, config.redis.host)
```

As you see

 - any underscore `_` in env var names signifies a new nesting level of configuration
 - all remaining keys are lowercased

`config` takes two arguments: `flat` defaulting to `process.env`, and `filter`, defaulting to `process.env.NODE_APP_PREFIX`. Changing these allows you to inject or reload environment variables.
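
The nesting convention described above can be sketched as follows. This is an illustration under stated assumptions, not the module's actual code: `nestEnv` is a hypothetical function that takes a flat object and a prefix, in the same spirit as the `flat` and `filter` arguments.

```javascript
// Illustration only: not the module's actual code.
function nestEnv(flat, prefix) {
  const config = {};
  for (const [name, value] of Object.entries(flat)) {
    if (!name.startsWith(`${prefix}_`)) continue; // skip unrelated vars
    // Drop the prefix, lowercase the rest, and nest on every underscore
    const keys = name.slice(prefix.length + 1).toLowerCase().split('_');
    let node = config;
    for (const key of keys.slice(0, -1)) {
      if (typeof node[key] !== 'object' || node[key] === null) node[key] = {};
      node = node[key];
    }
    node[keys[keys.length - 1]] = value;
  }
  return config;
}

const nested = nestEnv(
  { TLS_REDIS_HOST: '127.0.0.1', TLS_REDIS_USER: 'jane', NODE_ENV: 'production' },
  'TLS'
);
console.log(JSON.stringify(nested));
// {"redis":{"host":"127.0.0.1","user":"jane"}}
```

Unprefixed variables such as `NODE_ENV` are left out of the nested object, which matches the advice to read them via `process.env` directly.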

## Capturing a specific config file

By default, environmental code will capture environment variables of the current process. However, you can also use it to capture variables in isolation as produced by a given shell file (works with inherits too).

```coffeescript
env = new Environmental
env.capture "#{__dirname}/envs/production.sh", (err, flat) ->
  expect(err).to.be.null
  expect(flat.MYAPP_REDIS_HOST).to.equal "127.0.0.1"
```

Notice it does not nest the configuration. You can do that by using the `config` method and passing it flat environment vars:

```coffeescript
config = Environmental.config flat, "MYAPP"

expect(config).to.deep.equal
  redis:
    host: "127.0.0.1"
```

## Exporting to Nodejitsu

Nodejitsu also works with environment variables. But since those are hard to ship, they want you to bundle them in a json file.

Environmental can create such a temporary json file for you from the command-line. In this example it figures out all vars from `envs/production.sh` (even if it inherits from other files):

```bash
./node_modules/.bin/environmental --file=envs/production.sh --format=json > /tmp/jitsu-env.json
jitsu --confirm env load /tmp/jitsu-env.json
jitsu --confirm deploy
rm /tmp/jitsu-env.json
```

## Exporting to Heroku

```bash
heroku config:set $(./node_modules/.bin/environmental --file=envs/production.sh --format=space)
```

## Exporting to your own servers

To generate a single file that your server can source:

```bash
./node_modules/.bin/environmental --file=envs/production.sh --format=newline
```

Note that this is different from:

```bash
source envs/production.sh && env
```

The difference is that the output is cleansed of any environment variable that was not declared in `envs/production.sh` or one of its ancestors.

You could use this list to inject into a process upon (re)starts, or save as a file so upstart can inject it into a non-privileged process, and use e.g. rsync to distribute it amongst privileged users:

```bash
for host in `echo ${MYAPP_SSH_HOSTS}`; do
  rsync \
   --recursive \
   --links \
   --perms \
   --times \
   --devices \
   --specials \
   --progress \
  ./envs/ ${host}:${MYAPP_DIR}/envs
done
```

## Injecting into a non-privileged user process

When you deploy your app into production and you run the servers yourself, you might want to use upstart to respawn your process after crashes.

Here's what an [upstart](https://upstart.ubuntu.com/) file (`/etc/init/myapp.conf`) could look like, where the root user injects the environment keys into the process memory of an unprivileged user.

This has the big security advantage that your own program cannot even read its credentials from disk.

```bash
stop on runlevel [016]
start on (started networking)

respawn
respawn limit 10 5

limit nofile 32768 32768

pre-stop exec status myapp | grep -q "stop/waiting" && initctl emit --no-wait stopped JOB=myapp || true

script
  exec bash -c "cd /srv/myapp/current \
    && chown root envs/*.sh \
    && chmod 600 envs/*.sh \
    && source envs/production.sh \
    && exec sudo -EHu www-data make start 2>&1"
end script
```
]]></content:encoded>
      <dc:date>2014-05-17T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Git Hour Tracking</title>
      <link>https://kvz.io/git-hour-tracking.html</link>
      <description><![CDATA[Recently I was asked to estimate how many hours I worked on a project. Since I hadn't really tracked them I decided to use the Git history to get an indication.
]]></description>
      <pubDate>Thu, 01 May 2014 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/git-hour-tracking.html</guid>
      <content:encoded><![CDATA[Recently I was asked to estimate how many hours I worked on a project. Since I hadn't really tracked them I decided to use the Git history to get an indication.

<!--more-->

It turned out it was trickier than I thought to wrestle `git log` into producing a list that was somewhat useful, but here is a screenshot of what I ended up with:

![screen shot 2014-05-01 at 15 53 25](/assets/images/posts/2014-05-01-git-hour-tracking-0.png)

This list makes it pretty easy to filter by author with `|grep kevin`, to only show what days I worked and what times I left (`|sort -uk1,1`) and how early I started (`|tail -r |sort -uk1,1`), or copy to a spreadsheet editor for further processing.

Here is the Git log command I used to produce the list:

```bash
git --no-pager log \
  --date=iso \
  --since="2 months" \
  --date-order \
  --full-history \
  --all \
  --pretty=tformat:"%C(cyan)%ad%x08%x08%x08%x08%x08%x08%x08%x08%x08 %C(bold red)%h %C(bold blue)%<(22)%ae %C(reset)%s"
```

> Since we can't use a custom dateformat but really want to, the hack is to use ISO format (which comes closest to what we want), and add backspace characters (`%x08`). I [stole that from stackoverflow](https://stackoverflow.com/a/16735971/151666).

I saved this to [a file](https://gist.githubusercontent.com/kvz/bbb61b61e4ffab48e7f6/raw/8552dd34e922e6a18022c585ffb0aebceac526d0/git-timetracker.sh) in my `$PATH` so now in any repository I could type:

```bash
git-timetracker.sh 2 years
```

I also tried making a Git alias in my `~/.gitconfig` (not just calling the `.sh` script) but it was too damn hard :) Maybe you want to take a stab at it, I'm welcoming improvements!
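
If you'd rather do the "when did I start and when did I leave" step in code than with `sort`, here is a minimal Node.js sketch. It assumes input lines that begin with an ISO date as printed by `--date=iso` (for example `2014-05-01 15:53:25 +0200`) followed by the author email; the sample lines and the `workdays` function are made up for illustration. Times are compared as written, so it also assumes all commits share one timezone offset.

```javascript
// Hypothetical helper: first and last commit time per day.
// Expects lines like "2014-05-01 15:53:25 +0200 kevin@example.com".
function workdays(lines) {
  const days = new Map(); // date -> { first, last } as "HH:MM"
  for (const line of lines) {
    const [date, time] = line.split(' ');
    const hhmm = time.slice(0, 5);
    const day = days.get(date) || { first: hhmm, last: hhmm };
    // Zero-padded "HH:MM" strings sort correctly as plain strings
    if (hhmm < day.first) day.first = hhmm;
    if (hhmm > day.last) day.last = hhmm;
    days.set(date, day);
  }
  return days;
}

// Made-up sample input
const days = workdays([
  '2014-05-01 15:53:25 +0200 kevin@example.com',
  '2014-05-01 09:12:03 +0200 kevin@example.com',
  '2014-04-30 18:40:10 +0200 kevin@example.com',
]);
for (const [date, { first, last }] of days) {
  console.log(`${date} started ${first}, left ${last}`);
}
```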
]]></content:encoded>
      <dc:date>2014-05-01T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Fixing Heartbleed</title>
      <link>https://kvz.io/fixing-heartbleed.html</link>
      <description><![CDATA[Four days ago the news about the Heartbleed got every sysadmin's attention. Renowned security expert Bruce Schneier writes:
]]></description>
      <pubDate>Fri, 11 Apr 2014 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/fixing-heartbleed.html</guid>
      <content:encoded><![CDATA[Four days ago the news about the [Heartbleed](https://heartbleed.com) got every sysadmin's attention. Renowned security expert [Bruce Schneier](https://www.schneier.com/blog/archives/2014/04/heartbleed.html) writes:

> This means that anything in memory -- SSL private keys, user keys, anything -- is vulnerable. And you have to assume that it is all compromised. All of it.
>
> "Catastrophic" is the right word. On the scale of 1 to 10, this is an 11.

<!--more-->

Using a webtool to [test for Heartbleed](https://filippo.io/Heartbleed/) it became clear that my encoding startup [Transloadit](https://transloadit.com) was also hit. Luckily since we are running [stunnel](https://www.stunnel.org/index.html) for SSL the bleed is [limited](https://www.daemonology.net/blog/2014-04-09-tarsnap-no-heartbleed-here.html), but our certificates could still be stolen. In this post I want to show how I examined and stopped the Heartbleed in Ubuntu/stunnel, but the same approach will work for Apache, Nginx, HAProxy and other programs using OpenSSL to handle secure traffic.

If you pull all your software from a package manager like APT, you may choose to just upgrade all-the-things. But if your setup or requirements are more specific here's how to go through the verification/upgrading process step by step.

The bug was in [OpenSSL](https://www.openssl.org/news/secadv_20140407.txt) which stunnel uses.
Your stunnel may either reference it as a shared library, or if statically compiled, contain it (in that case you'll have to recompile the whole stack).

To find out, I first want to know where our SSL handler is installed.

```bash
$ which stunnel
  /srv/current/stack/bin/stunnel
```

Looks like we run a version in our stack directory. Let's see what shared libraries it uses by running `ldd` on it. If we spot libssl, we can just upgrade that shared library and restart stunnel to address the Heartbleed.

```bash
$ ldd /srv/current/stack/bin/stunnel
  linux-vdso.so.1 =>  (0x00007fff8f6e7000)
  libssl.so.1.0.0 => /lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007fc73f4ab000)
  libcrypto.so.1.0.0 => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0 (0x00007f73f9c31000)
  libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f73f9a2d000)
  libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f73f9810000)
  libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f73f9451000)
  libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f73f924c000)
  libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f73f9035000)
  /lib64/ld-linux-x86-64.so.2 (0x00007f73fa272000)
```

It uses quite a few shared libraries. But notice `libssl.so` on the second line, pointing to the file `/lib/x86_64-linux-gnu/libssl.so.1.0.0`.

Let's find out if there is an APT package that is providing this affected file:

```bash
$ dpkg -S /lib/x86_64-linux-gnu/libssl.so.1.0.0
  libssl1.0.0: /lib/x86_64-linux-gnu/libssl.so.1.0.0
```

So `libssl1.0.0` is the package name providing the SSL library.
Doing a quick `apt-get update` and `apt-get install libssl1.0.0` shows there are indeed updates available, and that `libssl-dev` is coupled.

I verify that doing the upgrade solves Heartbleed on one server, and decide to roll this out in a forceful way using this script:

```bash
$ export DEBIAN_FRONTEND=noninteractive && \
  sudo -E apt-get -y update && \
  sudo -E apt-get -y install libssl-dev && \
  sudo -E service stunnel restart
```

And that fixed our Heartbleed, although we should still re-issue our certificate to make sure people who might have stolen it while we were exposed cannot impersonate us online.

If you were running an SSL handler that had access to more than just the certificates, e.g. Apache being able to access user sessions, you might want to ask your users to change their passwords.

As of late there's been quite some critique on the code quality of OpenSSL. As efforts are being made to improve, it might make sense to move out SSL handling to an isolated component such as stunnel.

Thanks for reading. I hope this post helps people better diagnose and fix Heartbleed in their setups. Let me know if you have suggestions, I'm always open to learning.
]]></content:encoded>
      <dc:date>2014-04-11T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>GitHub Spring Cleaning - the Deprecation Hack</title>
      <link>https://kvz.io/how-to-deprecate-projects-on-github.html</link>
      <description><![CDATA[Almost spring here! Birds are chirping and we start cleaning out our kitchens and backyards and closets and GitHub accounts. Let's trash some legacy!
]]></description>
      <pubDate>Fri, 21 Feb 2014 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/how-to-deprecate-projects-on-github.html</guid>
      <content:encoded><![CDATA[Almost spring [here](https://www.google.com/maps/place/Amsterdam)! Birds are chirping and we start cleaning out our kitchens and backyards and closets and GitHub accounts. Let's trash some legacy!

Why? Because

- We're ashamed of old code
- We want to save money by having a lower (private) repo count
- We want to improve the signal-to-noise on our profiles before a job interview
- [Spring](https://en.wikipedia.org/wiki/Spring_cleaning)

But wait, what if your co-worker wants to access some of those commits again? You probably don't feel like peeling archives from crashed backup drives in the basement of your previous building.

[Renan](https://twitter.com/renan_saddam) and I faced this at [true.nl](https://true.nl) and we started looking for simple solutions.

<!--more-->

After going over several, we (ok, Renan) came up with the idea of storing every old repository's `master` branch to a self-named branch in a single `deprecated` repository.

Here's what it might look like on 3 sample repos:

```bash
github.com/kvz/eggshell/tree/master -> github.com/kvz/deprecated/tree/eggshell
github.com/kvz/submin/tree/master   -> github.com/kvz/deprecated/tree/submin
github.com/kvz/Elastica/tree/master -> github.com/kvz/deprecated/tree/Elastica
```

Hack? Yes. But the advantages of this method are clear. You get to:

- Preserve paths, commits, users
- Use GitHub's web interface to quickly traverse the archives and link to them
- Make it very clear to people that they're looking at indeed deprecated code
- Make the deprecated repo private if need be and enjoy GitHub's access control
- Checkout a `deprecated` repo branch, force-push it to a fresh repo's `master` and be back in business

It's limited in that we only preserve the `master` branch, but we figured that would suffice for repos whose code history would otherwise just have been made inaccessible in far worse ways.

Starting is simple. You [create a container repo](https://github.com/new) on GitHub named `deprecated`, add it as a remote to existing repos, force-push the current `master` to a named branch, and you're done.

However, spring cleaning is no fun without automation, so we wrote a script to do this for you. Just change the `repo_sources` and `repo_destiny` variables.

> If you don't understand what this does, please don't run it

```bash
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# Repositories to deprecate and their destination
repo_user="kvz"
repo_sources=(eggshell submin Elastica)
repo_destiny="deprecated"

todo=""
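# Create a scratch directory (a full template works with both GNU and BSD mktemp)
myTemp="$(mktemp -d "${TMPDIR:-/tmp}/deprecate.XXXXXX")" && cd "${myTemp}"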

# Iterate the commands below for every repository you wanna merge
for repo_source in "${repo_sources[@]}"; do
  git clone git@github.com:${repo_user}/${repo_source}.git || true

  pushd ${repo_source}
    git clean -fd
    git reset --hard
    git checkout master
    git pull -f origin master
    git checkout -B ${repo_user}/${repo_source}
    git push -f git@github.com:${repo_user}/${repo_destiny}.git ${repo_user}/${repo_source}
  popd

  todo="${todo}--> Feel free to delete the repository at https://github.com/${repo_user}/${repo_source}/settings\n"
done

todo="${todo}--> Saved ${repo_sources[*]} as branches in https://github.com/${repo_user}/${repo_destiny}/branches\n"
echo -e "${todo}"
```

Let us know what you think!
]]></content:encoded>
      <dc:date>2014-02-21T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>It's Almost 2014 and We Are Still Committing Broken Code</title>
      <link>https://kvz.io/one-git-commit-hook-to-rule-them-all.html</link>
      <description><![CDATA[Despite test cases, syntax errors still find their way into our commits.
]]></description>
      <pubDate>Sat, 21 Dec 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/one-git-commit-hook-to-rule-them-all.html</guid>
      <content:encoded><![CDATA[Despite test cases, syntax errors still find their way into our commits.

- Maybe it was a change in that bash script that wasn't covered by tests. Too bad our deploys relied on it.
- Maybe it was just a textual change and we didn't think it was necessary to run the associated code before pushing this upstream. Too bad we missed that quote.

Whatever the reason, **it's almost 2014 and we are still committing broken code**. This needs to change because in the

- Best case: Travis or Jenkins prevents those errors from hitting production, but it's frustrating to go back and revert/redo that stuff. A waste of your time and state of mind, as you had already moved on to other things.
- Worst case: your error goes unnoticed and hits production.

Git offers commit hooks to prevent bad code from entering the repository, but you have to install them on a local per-project basis.

Chances are you have been too busy/lazy and never took the time/effort to whip up a commit hook that could deal with all your projects and programming languages.

That holds true for me; however, I recently had some free time and decided to invest it in cooking up `ochtra`: **O**ne **C**ommit **H**ook **T**o **R**ule **A**ll.

<!--more-->

## Features

I first set out to find existing hooks, but all of the ones I found had caveats I wanted to avoid. So unlike those, this hook:

- Works on many languages (Ruby, JavaScript, Python, Bash, Go, and PHP)
- Works on filenames with spaces
- Works on initial commits
- Will skip files that are staged to be deleted
- Will not run when we're not currently on a branch
- Checks files as staged in Git, not as they currently happen to be saved in your working dir
- Deals with discrepancies between linters sometimes printing errors on STDOUT vs STDERR
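
That last point can be sketched in isolation. This is a minimal, hypothetical helper (the name `tag_streams` and the `[out]`/`[err]` prefixes are made up for illustration and are not part of `ochtra`) that swaps file descriptors so a command's STDOUT and STDERR can each be post-processed by their own `sed`:

```bash
#!/usr/bin/env bash

# Hypothetical helper, not part of ochtra: runs the given command and
# prefixes its STDOUT lines with "[out] " and its STDERR lines with "[err] ".
tag_streams() {
  # Inside the braces, fd 3 points at the outer pipe (where STDOUT would go).
  # The command's STDERR is sent into the inner pipe, its STDOUT to fd 3.
  { "$@" 2>&1 1>&3 | sed 's/^/[err] /' >&2; } 3>&1 | sed 's/^/[out] /'
}

tag_streams sh -c 'echo fine; echo broken >&2'
```

Once both streams pass through their own filter like this, it no longer matters much which one a linter picks for its errors: you decide per tool how each stream gets treated.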

## The Code

Feel free to review and suggest improvements

```bash
#!/usr/bin/env bash
#
# A Git pre-commit hook that checks for syntax errors
# for: Ruby, JavaScript, Python, Bash, Go, and (Cake)PHP
# based on the extensions of staged files in Git.
# Can be 'installed globally' as of Git 1.7.1 using init.templatedir
#
# Copyright 2013, kvz (https://twitter.com/kvz)
#
# Necessary check for initial commit
against="4b825dc642cb6eb9a060e54bf8d69288fbee4904"
git rev-parse --verify HEAD >/dev/null 2>&1 && against="HEAD"

# Only run when we're on a branch (to avoid rebase hell)
# https://git-blame.blogspot.nl/2013/06/checking-current-branch-programatically.html
if branch=$(git symbolic-ref --short -q HEAD); then
  echo on branch $branch
else
  echo not on any branch
  exit 0
fi

# Takes a command as arguments and paints both its STDOUT & STDERR in
# colors specified in first and second arguments. Use 'purge' to skip printing
# at all
function paint() (
  set -o pipefail;

  green=$'s,.*,\x1B[32m&\x1B[m,'
  red=$'s,.*,\x1B[31m&\x1B[m,'
  gray=$'s,.*,\x1B[37m&\x1B[m,'
  purge="/.*/d"

  stdout="${!1}"
  stderr="${!2}"

  ("${@:3}" 2>&1>&3 |sed ${stderr} >&2) 3>&1 \
                    |sed ${stdout}
)

# (A)dded (C)opied or (M)odified
git diff-index --cached --full-index --diff-filter=ACM $against |while read -r line; do
  sha="$(echo ${line} |cut -d' ' -f4)"
  sts="$(echo ${line} |cut -d' ' -f5)"
  pth="$(echo ${line} |cut -d' ' -f6-)"
  ext="${pth##*.}"

  she="$(git cat-file -p ${sha} |head -n1 |awk -F/ '/^#\!/ {print $NF}' |sed 's/^env //g')"
  out="purge"
  err="red"
  cmd=""
  tmp=""

  # Select linting tool based on extension or shebang
  if [ "${ext}" = "rb" ] || [ "${ext}" = "erb" ] || [ "${she}" = "ruby" ]; then
    cmd="ruby -c -"
  elif [ "${ext}" = "js" ] || [ "${she}" = "node" ]; then
    # jshint unfortunately uses STDOUT for errors, so paint all red
    cmd="jshint -"
    out="red"
  elif [ "${ext}" = "coffee" ] || [ "${she}" = "coffee" ]; then
    # coffeelint unfortunately uses STDOUT for errors, so paint all red
    cmd="coffeelint --stdin"
    out="red"
  elif [ "${ext}" = "py" ] || [ "${she}" = "python" ]; then
    tmp="${TMPDIR:-/tmp}/${$}.${ext}"
    cmd="pylint --errors-only ${tmp}"
    out="red"
  elif [ "${ext}" = "go" ]; then
    cmd="gofmt -e"
  elif [ "${she}" = "bash" ]; then
    cmd="bash -n"
  elif [ "${she}" = "sh" ]; then
    cmd="sh -n"
  elif [ "${ext}" = "php" ] || [ "${ext}" = "ctp" ] || [ "${she}" = "php" ]; then
    cmd="php -n -l -ddisplay_errors=1 -derror_reporting=E_ALL -dlog_errors=0"
  elif [ "${ext}" = "pl" ] || [ "${she}" = "perl" ]; then
    cmd="perl -wc -"
  fi

  if [ -n "${cmd}" ]; then
    tool="$(echo "${cmd}" |cut -d' ' -f1)"
    paint "gray" "red" echo "--> ${tool} syntax checking for ${pth}"
  else
    paint "gray" "red" echo "--> No syntax checking for ${pth}"
    continue
  fi

  # require linting tool
  if ! which "${tool}" >/dev/null 2>&1; then
    paint "red" "red" echo "Please install ${tool} for pre-commit syntax checking. "
    exit 1
  fi

  # execute on staged buffer
  [ -n "${tmp}" ] && git cat-file -p ${sha} > "${tmp}"

  if ! git cat-file -p ${sha} |paint ${out} ${err} ${cmd}; then
    paint "red" "red" echo "Please fix ${tool} syntax errors and type 'git add ${pth}'"
    [ -n "${tmp}" ] && rm -f "${tmp}"
    exit 1
  fi

  [ -n "${tmp}" ] && rm -f "${tmp}"

  paint "green" "red" echo "No ${tool} syntax errors detected in '${pth}'"
done
```

This file is being maintained on [GitHub](https://github.com/kvz/ochtra/blob/master/pre-commit) and could be more up to date there.

## Test

Without installing anything, you can try `ochtra` on a local repository:

```bash
$ mkdir my-project && cd $_
$ git init
$ echo ";-)" > syntax-error.go
$ git add syntax-error.go
$ curl -s https://raw.github.com/kvz/ochtra/master/pre-commit |bash
```

This will showcase `ochtra` on a staged Go file with syntax errors without having to install or commit anything.

If you want to test `ochtra` against all languages, you can run the test suite:

```bash
git clone https://github.com/kvz/ochtra.git
cd ochtra
make test
```

## Install

As of Git 1.7.1, you can use the `init.templatedir` setting to store hooks that you want present in all your repositories. These files will be copied from e.g. `~/.gittemplate/` into your
current repo's `.git` dir upon every `git init`.

This also works for existing repos, and it will not overwrite files already present.

To install the pre-commit template type

```bash
$ mkdir -p ~/.gittemplate/hooks
$ curl https://raw.github.com/kvz/ochtra/master/pre-commit -o ~/.gittemplate/hooks/pre-commit \
 && chmod u+x $_
$ git config --global init.templatedir '~/.gittemplate'
```

The template is just sitting there. To install the hook into new (or existing!) repos, type

```bash
$ git init
```

From now on, any file you are about to commit will first be checked for syntax errors.

If you ever update your template you can type

```bash
$ rm .git/hooks/pre-commit && git init
```

## Tips

- If you ever want to commit code and skip the pre-commit hook one time, type

```bash
$ git commit -n
```

This can be useful if you import big chunks of code that don't pass jshint yet.

## Feedback

It's a work in progress and I would like your feedback on this to make it harder, better, faster, stronger and have it support more languages. Our work is never over :)

Leave a comment here or let's collaborate on [GitHub](https://github.com/kvz/ochtra)

## Thanks To

These pages have been a great source of inspiration when building `ochtra`:

- <https://mark-story.com/posts/view/using-git-commit-hooks-to-prevent-stupid-mistakes>
- <https://stackoverflow.com/a/8842663/151666>
- <https://github.com/phpbb/phpbb/blob/develop-olympus/git-tools/hooks/pre-commit>
]]></content:encoded>
      <dc:date>2013-12-21T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Make Your MySQL Tables Strict</title>
      <link>https://kvz.io/make-your-mysql-table-strict.html</link>
      <description><![CDATA[When you're upgrading to MySQL 5.6 you may notice strict mode is turned
on by default. You can disable it, but now might be a good time to
get your schemas strict, to ensure smooth upgrade paths in the future.
]]></description>
      <pubDate>Mon, 02 Dec 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/make-your-mysql-table-strict.html</guid>
      <content:encoded><![CDATA[When you're upgrading to MySQL 5.6 you may notice strict mode is turned
on by default. You can disable it, but now might be a good time to
get your schemas strict, to ensure smooth upgrade paths in the future.

One particularly common failure is columns that:

- have no default (e.g. `""`)
- do not allow `NULL`

Now when MySQL has to create a record in which you have omitted such a
field, it has to guess what to store there. For `VARCHAR`s it'll save `""`,
for integers `0`.

Guesswork is best not left to your database engine as this could lead to
ambiguity and unexpected results, which is why when strict mode is on
(you can check with `SELECT @@sql_mode`),
MySQL will error out:

```bash
mysql> INSERT INTO accounts VALUES();
ERROR 1364 (HY000): Field 'name' doesn't have a default value
```

I have found that the least intrusive way to make my schemas strict
is to allow `NULL` (but your mileage may vary).

If you want to set all problematic fields to allow `NULL`, I wrote
a query that generates the appropriate statements:

```sql
SELECT
 CONCAT('ALTER TABLE `', TABLE_NAME, '` MODIFY `', COLUMN_NAME, '` ', COLUMN_TYPE, '; ') as strict_schema_changes
FROM information_schema.columns
WHERE 1=1
 AND IS_NULLABLE = 'NO'
 AND COLUMN_DEFAULT IS NULL
 AND TABLE_SCHEMA= 'transloadit';
 -- You'll have to change `transloadit` to your database name.
```

This will for instance generate:

```sql
ALTER TABLE `accounts` MODIFY `name` varchar(200);
ALTER TABLE `accounts` MODIFY `company` varchar(70);
ALTER TABLE `assemblies` MODIFY `updated` datetime;
ALTER TABLE `blog_posts` MODIFY `title` varchar(256);
ALTER TABLE `countries` MODIFY `is_eu` tinyint(1);
ALTER TABLE `credits` MODIFY `created` datetime;
ALTER TABLE `invoices` MODIFY `to_vat_id` varchar(30);
```

Note that allowing `NULL` is the default when modifying columns
(as opposed to specifying `NOT NULL` in your declaration).

Obviously you'll need to carefully test your app and revert your
migrations if anything breaks, but this could give you a head start.

Hope this helps!
]]></content:encoded>
      <dc:date>2013-12-02T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Best Practices for Writing Bash Scripts</title>
      <link>https://kvz.io/bash-best-practices.html</link>
      <description><![CDATA[
  This project now has its own homepage at bash3boilerplate.sh.

]]></description>
      <pubDate>Thu, 21 Nov 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/bash-best-practices.html</guid>
      <content:encoded><![CDATA[> This project now has its own homepage at [bash3boilerplate.sh](https://bash3boilerplate.sh).

I recently tweeted a few best practices that I picked up over the years
and got some good feedback. I decided to write them all
down in a blogpost. Here goes

1. Use long options (`logger --priority` vs `logger -p`). If you're on the CLI, abbreviations make sense for efficiency, but when you're writing reusable scripts a few extra keystrokes
   will pay off in readability and save you or your collaborators ventures into man pages in the future.

1. Use `set -o errexit` (a.k.a. `set -e`) to make your script exit when a command fails.

1. Then add `|| true` to commands that you allow to fail.

1. Use `set -o nounset` (a.k.a. `set -u`) to exit when your script tries to use undeclared variables.

1. Use `set -o xtrace` (a.k.a. `set -x`) to trace what gets executed. Useful for debugging.

1. Use `set -o pipefail` in scripts to catch `mysqldump` fails in e.g. `mysqldump |gzip`. The exit status of the last command that threw a non-zero exit code is returned.

1. `#!/usr/bin/env bash` is more portable than `#!/bin/bash`.

1. Avoid using `#!/usr/bin/env bash -e` (vs `set -e`), because when someone runs your script as `bash ./script.sh`, the exit on error will be ignored.

1. Surround your variables with `{}`. Otherwise bash will try to access the `$ENVIRONMENT_app` variable in `/srv/$ENVIRONMENT_app`, whereas you probably intended `/srv/${ENVIRONMENT}_app`.

1. You don't need two equal signs when checking `if [ "${NAME}" = "Kevin" ]`.

1. Surround your variable with `"` in `if [ "${NAME}" = "Kevin" ]`, because if `$NAME` isn't declared, bash will throw a syntax error (also see `nounset`).

1. Use `:-` if you want to test variables that could be undeclared. For instance: `if [ "${NAME:-}" = "Kevin" ]` will expand to an empty string if `$NAME` is not declared (without assigning anything to it). You can also fall back to `noname` like so: `if [ "${NAME:-noname}" = "Kevin" ]`

1. Set magic variables for current file, basename, and directory at the top of your script for convenience.
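
To make the tips on `{}` and `:-` a bit more tangible, here's a small sketch. The function names `app_dir` and `greet` and the `development` fallback are assumptions made up for this example. The functions use `( )` bodies so that `nounset` stays contained in a subshell:

```bash
#!/usr/bin/env bash

# Hypothetical helpers for illustration. The ( ) function bodies run in a
# subshell, so `set -o nounset` doesn't leak into the calling shell.

app_dir() (
  set -o nounset
  # Braces end the variable name before "_app". ":-" supplies a fallback
  # for this expansion only, ENVIRONMENT itself is not assigned.
  echo "/srv/${ENVIRONMENT:-development}_app"
)

greet() (
  set -o nounset
  # Quoted and with ":-", so an undeclared NAME doesn't trip up nounset
  if [ "${NAME:-}" = "Kevin" ]; then
    echo "Hi Kevin"
  else
    echo "Hi ${NAME:-noname}"
  fi
)

app_dir
greet
```

Running it with e.g. `ENVIRONMENT=staging NAME=Kevin` in front shows the other branch of both functions.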

Summarizing, why not start your next bash script like this:

```bash
#!/usr/bin/env bash
# Bash3 Boilerplate. Copyright (c) 2014, kvz.io

set -o errexit
set -o pipefail
set -o nounset
# set -o xtrace

# Set magic variables for current file & dir
__dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
__file="${__dir}/$(basename "${BASH_SOURCE[0]}")"
__base="$(basename "${__file}" .sh)"
__root="$(cd "$(dirname "${__dir}")" && pwd)" # <-- change this as it depends on your app

arg1="${1:-}"
```
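
If you're wondering what `pipefail` from the tips above changes in practice, this sketch compares the exit status of the same pipeline with the option off and on. The helper name `pipeline_status` is made up, and `false | cat` merely stands in for something like `mysqldump |gzip`:

```bash
#!/usr/bin/env bash

# Hypothetical helper: prints the exit status of `false | cat`
# with pipefail switched "on" or "off" (first argument).
pipeline_status() (
  if [ "${1:-off}" = "on" ]; then
    set -o pipefail
  else
    set +o pipefail
  fi

  status=0
  # `false` fails, `cat` succeeds. The `||` keeps errexit from aborting here.
  false | cat || status="${?}"
  echo "${status}"
)

pipeline_status off # prints 0: only the status of `cat` counts
pipeline_status on  # prints 1: the failure of `false` is no longer hidden
```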

If you have additional tips, please share and I'll update this post.
]]></content:encoded>
      <dc:date>2013-11-21T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>File Uploading Without a Server</title>
      <link>https://kvz.io/file-uploading-without-serverside-code.html</link>
      <description><![CDATA[More and more sites are written in flat HTML. Hosted on GitHub pages,
S3, etc. The advantages are clear: ridiculously low to no hosting costs, it can
hardly ever break, and with things like Jekyll and Octopress
it can still be fun to maintain. And with JavaScript frameworks such as Angular you
could build entire apps clientside. The downsides are clear too: no central point of knowledge makes
interaction between users hard.
]]></description>
      <pubDate>Tue, 03 Sep 2013 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/file-uploading-without-serverside-code.html</guid>
      <content:encoded><![CDATA[More and more sites are written in flat HTML. Hosted on GitHub pages,
S3, etc. The advantages are clear: ridiculously low to no hosting costs, it can
hardly ever break, and with things like Jekyll and [Octopress](/blog/2012/09/25/blog-with-octopress/)
it can still be fun to maintain. And with JavaScript frameworks such as Angular you
could build entire apps clientside. The downsides are clear too: no central point of knowledge makes
interaction between users hard.

However with services like [Disqus](https://disqus.com), and (my own startup) [Transloadit](https://transloadit.com), it gets more
and more feasible to just run a flat site and have external services cover for not
running serverside code and a database yourself.

In this post I'm going to show you how easy it is to make file uploading possible
even if your site is just a single page of HTML.

<!--more-->

## Not Super Secure

Now this certainly is not very secure. Although your S3 credentials will be encrypted
inside Transloadit's account, very little will stop wrong-doers from filling up your S3
bucket by re-using references they can find in the HTML code. They won't be able to delete
or change existing files in your bucket, but it's still annoying-to-extremely-harmful,
depending on what you are building.

For a few usecases (a 4chan clone / intranet page / etc) it might be an *okay* tradeoff.
If not, you'll have to enrich this
example with [signatures](https://transloadit.com/docs/api-docs#authentication),
but that will require serverside code to shield off secret keys from prying eyes.

## Storage

In this example, we'll use Amazon S3 buckets to store the files, but Transloadit also
supports Rackspace Cloudfiles and (s)FTP as storage targets.

## Let's do it

First of all

1. [Sign up for Amazon AWS](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html)
1. [Create an S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html) at Amazon. Write down your `Bucket Name`
1. [Generate Security Keys](https://console.aws.amazon.com/iam/home?#security_credential) at Amazon. Write down both the `Access Key ID` and the `Secret Access Key`
1. [Sign up for Transloadit](https://transloadit.com/pricing). Write down your `API Key` [found here](https://transloadit.com/accounts/credentials)
1. [Create a Template](https://transloadit.com/templates/add), name it `just_save_to_s3`, write down the `template_id`, and save the following content in it:

```javascript
{
  "steps": {
    "export": {
      "robot" : "/s3/store",
      "key"   : "AMAZON_ACCESS_KEY_ID",
      "secret": "AMAZON_SECRET_ACCESS_KEY",
      "bucket": "AMAZON_BUCKET_NAME"
    }
  }
}
```

As you can see, this Transloadit Template now knows how to store any uploads thrown at it.

Now just create the HTML page and refer to the `template_id` that contains the encoding / uploading
instructions:

```markup
<!doctype html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Transloadit Demo</title>
  <link rel="stylesheet" href="//netdna.bootstrapcdn.com/bootstrap/3.0.0/css/bootstrap.min.css">
</head>
<body>
  <div class="container">
    <h1>Transloadit Demo</h1>

    <p>
      This method works without any serverside code, so you could also just store a flat
      HTML page on S3. However, please note that this is <strong>not very secure</strong>.
    </p>

    <!-- Transloadit code starts here -->
    <form id="transloadit" action="index.html?upload=complete" enctype="multipart/form-data" method="POST">
      <input type="file" name="my_file" />
    </form>

    <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
    <script src="//assets.transloadit.com/js/jquery.transloadit2-v2-latest.js"></script>
    <script type="text/javascript">
       $(function() {
         $('form#transloadit').transloadit({
            triggerUploadOnFileSelection: true,
            wait: true,
            params: {
              auth: {
                key: "TRANSLOADIT_AUTH_KEY"
              },
              template_id: "TRANSLOADIT_TEMPLATE_ID"
            }
         });
       });
    </script>
    <!-- Transloadit code ends here -->

    <script src="//netdna.bootstrapcdn.com/bootstrap/3.0.0/js/bootstrap.min.js"></script>
  </div>
</body>
</html>
```

And that's it. An upload button will appear and all files uploaded will be added to your bucket!

Now if you need to perform extra operations on the uploads (like extracting thumbnails from video,
extracting the contents of a zip, etc), just add some `steps` to your template. Here are some [examples](https://transloadit.com/demos).

Again, note that this is not the best/recommended way to use Transloadit, but after hearing
from customers who had a usecase for this, I thought I'd share :)
]]></content:encoded>
      <dc:date>2013-09-03T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Yesterday I Wrote My First Firefox OS App</title>
      <link>https://kvz.io/yesterday-i-wrote-my-first-firefox-os-app.html</link>
      <description><![CDATA[Yesterday I wrote my first Firefox OS App.
]]></description>
      <pubDate>Mon, 12 Aug 2013 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/yesterday-i-wrote-my-first-firefox-os-app.html</guid>
      <content:encoded><![CDATA[Yesterday I wrote my first Firefox OS App.

<!--more-->

For now it's called [kbt2](https://github.com/kvz/kbt2) and it's a round timer that I can use to give
kickboxing lessons.

After:

- a few very frustrating hours dealing with the unintuitive and sometimes even failing Everlast Round Timer
- knowing that I could not use my own phone as it will be playing music during kickboxing sessions
- having a spare [Firefox Developer Preview Phone](https://marketplace.firefox.com/developers/dev_phone) thanks
  to [Sergi Mansilla](https://twitter.com/sergimansilla) and a lucky raffle on a [Decode Friday meetup](https://www.meetup.com/DecodeFriday/events/119848052/)
- knowing that building Firefox OS apps is as easy as creating an HTML site with some JSON inside a `./manifest.webapp` for app definition, and JavaScript calls to make it e.g. vibrate

.. I decided to use my Geeksphone as a dedicated interval timer / instruction guide and started hacking on an app for that. It all went remarkably smoothly.

Here's the phone:

![keon_mobile01](/assets/images/posts/1bd6bd98-033e-11e3-8ced-c2111b3d2fac.jpg)

It's a "Keon" Developer Preview by [Geeksphone](https://www.geeksphone.com/). I was lucky to win one, but I'm told they'll only be 50$.

You point its web browser to the location of your app. Your app can detect the phone and offer an install via a simple `navigator.mozApps.install()`

![screen shot 2013-08-12 at 12 14 43 pm](/assets/images/posts/18f77e56-0338-11e3-8f45-0cdc2742907d.png)

This basically copies all the assets listed in `./manifest.appcache` to your phone, so it can be accessed
without internet (awesome, because there's bad reception inside the gym :)

![screen shot 2013-08-12 at 12 14 55 pm](/assets/images/posts/1562ef1e-0338-11e3-8ac0-8f427f227701.png)

Now just launch the app

![screen shot 2013-08-12 at 12 02 21 pm](/assets/images/posts/4cc0affc-0336-11e3-9f27-0dc043aefe6d.png)

And that's it. I hacked this up on a rainy Sunday afternoon thanks to a headstart with:

- Great [docs on the Mozilla Developer Network](https://developer.mozilla.org/en-US/docs/Web/Apps/Getting_Started)
- The [Firefox OS Boilerplate App](https://github.com/robnyman/Firefox-OS-Boilerplate-App) which bundles some common code
- The [Firefox OS Simulator 4.0](https://addons.mozilla.org/en-US/firefox/addon/firefox-os-simulator/) which let me test & refresh with just 1 click

Obviously this particular project is quite specific to my use-case; but still open sourced for inspirational purposes.

The first Firefox phones are targeted at emerging markets, so feature-wise they can't really compete with - and won't replace - your modern-day iOS/Android devices.

However, at just 50$ you do get a considerable amount of hardware:

- 1 GHz CPU
- 512 MB RAM
- GPS. Wifi N/UMTS/GSM reception
- 3.5" HVGA touchscreen (!)
- 3 mega pixel camera
- Light & proximity sensor. G-Sensor
- USB
- 1580 mAh Battery

.. That you can easily talk to via [JavaScript APIs](https://developer.mozilla.org/en-US/docs/Web/API).
Just imagine what other dedicated applications you could build on top of this :) Be it:

- a remote controller
- the brain of a robot that you're building
- a security device taking pictures when it detects changes in light
- in your car, uploading G-forces & GPS to track when you've been driving most economically :)

For some things a [Raspberry Pi](https://www.raspberrypi.org/) or [Arduino](https://www.arduino.cc/) makes more sense, but since this has a touchscreen, solid housing, extra sensors, and the platform is fully open too, I see a lot of possibilities.

**Update #1**

The Keon I won will be sold at [91 EUR](https://shop.geeksphone.com/en/phones/1-keon.html), not 50$ as I mentioned. However, ZTE will launch a 79$ Firefox phone on eBay this Friday.
]]></content:encoded>
      <dc:date>2013-08-12T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Fix Vagrant Box Hanging at Boot</title>
      <link>https://kvz.io/fix-vagrant-box-hanging-at-boot.html</link>
      <description><![CDATA[Sometimes it happens that vagrant hangs during boot of your virtual image. Right after typing:
]]></description>
      <pubDate>Thu, 25 Jul 2013 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/fix-vagrant-box-hanging-at-boot.html</guid>
      <content:encoded><![CDATA[Sometimes it happens that vagrant hangs during boot of your virtual image. Right after typing:

```bash
$ vagrant up
```

It hangs for a long time and then finally throws:

```bash
[default] Failed to connect to VM!
Failed to connect to VM via SSH. Please verify the VM successfully booted
by looking at the VirtualBox GUI.
```

If you open VirtualBox you'll see that the virtual machine preview shows a black
screen with kernels to choose from. This is GRUB requiring user input to boot further.

Here's how to fix that.

<!--more-->

First, get the machine id

```bash
$ # Before v1.1
$ # MACHINE_ID=$(awk -F\" '{print $6}' .vagrant)
$ # After v1.1
$ MACHINE_ID=$(cat .vagrant/machines/default/virtualbox/id)
```
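
If you need this more often, the lookup could be wrapped in a small function. This is only a sketch: the name `vagrant_machine_id` is made up, and like the commands above it assumes a machine called `default` on the VirtualBox provider. For the pre-v1.1 layout it reuses the `awk` one-liner shown above:

```bash
#!/usr/bin/env bash

# Hypothetical helper: prints the VirtualBox machine id for the Vagrant
# project in the given directory (defaults to the current directory).
vagrant_machine_id() {
  local project_dir="${1:-.}"
  local id_file="${project_dir}/.vagrant/machines/default/virtualbox/id"

  if [ -f "${id_file}" ]; then
    # Vagrant v1.1 and later: .vagrant is a directory
    cat "${id_file}"
  elif [ -f "${project_dir}/.vagrant" ]; then
    # Before v1.1: .vagrant is a single file
    awk -F\" '{print $6}' "${project_dir}/.vagrant"
  else
    echo "No Vagrant machine id found in ${project_dir}" >&2
    return 1
  fi
}
```

You'd then use `MACHINE_ID="$(vagrant_machine_id)"` from your project directory and continue with the steps below.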

Power off the VM

```bash
$ VBoxManage controlvm ${MACHINE_ID} poweroff
```

Then boot the machine with a GUI console

```bash
$ VBoxManage startvm ${MACHINE_ID}
```

Wait for it to boot, login, run:

```bash
$ sudo update-grub
```

When successful, shut it down

```bash
$ VBoxManage controlvm ${MACHINE_ID} poweroff
```

And after this, `vagrant up` will just boot your VM normally :)
]]></content:encoded>
      <dc:date>2013-07-25T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Deploy to a Dynamic Serverlist With Capistrano</title>
      <link>https://kvz.io/deploy-to-variable-targets-with-capistrano.html</link>
      <description><![CDATA[At our company we use Capistrano for deploys. It reads Ruby instructions
from a ./Capfile in the project's root directory, then deploys
accordingly via SSH. It has support for releases, shared log dirs, rollbacks,
rsync vs remote cached git deploys, etc. It can be run from any machine
that has access to your production servers. Be it your workstation, or a
Continuous Integration server.
]]></description>
      <pubDate>Mon, 15 Jul 2013 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/deploy-to-variable-targets-with-capistrano.html</guid>
      <content:encoded><![CDATA[At our company we use Capistrano for deploys. It reads Ruby instructions
from a `./Capfile` in the project's root directory, then deploys
accordingly via SSH. It has support for releases, shared log dirs, rollbacks,
rsync vs remote cached git deploys, etc. It can be run from any machine
that has access to your production servers. Be it your workstation, or a
Continuous Integration server.

So all in all pretty convenient but typically it assumes you know what servers you
want to deploy to at the time of writing your `Capfile`.

What if the composition of your platform changes often? Will you keep changing
the `Capfile` right before every deploy? Seems like effort ; )

<!--more-->

## Dynamic Configuration of Deploy Targets

Here's what a snippet of a handwritten `Capfile` might look like:

```ruby
# Static Capistrano targets
role :app,
  "server1.example.com",
  "server2.example.com"
```

But if you have a highly volatile cloudplatform where servers come and go,
you probably don't want to edit your `Capfile` to reflect what's currently
in production with every deploy.

There are probably better ways to write the replacement since my Ruby-fu is limited,
but assuming you keep a `variable_server_list.sh` script that prints each of the current
hostnames of your platform on a new line (e.g. by using your hosting provider's API),
you could define your `:app` role dynamically like so:

```ruby
# Dynamic Capistrano targets
hostnames = run_locally "./variable_server_list.sh"
hostnames = hostnames.split("\n")
for hostname in hostnames
	server hostname, :app
end
```

And this will make Capistrano deploy to all active servers in production.
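
For completeness, here's a rough sketch of what such a server list script could look like. Everything in it is an assumption: instead of calling a real hosting provider's API, it reads a hypothetical `servers.txt` inventory (overridable via a made-up `SERVER_INVENTORY` variable) that has `<hostname> <status>` on every line, and prints the hostnames marked `active`:

```bash
#!/usr/bin/env bash
# Hypothetical server list script: prints one active hostname per line.

# Prints the hostnames marked "active" in an inventory file that has
# "<hostname> <status>" on every line. Blank lines and # comments are skipped.
list_active_servers() {
  local inventory="${1}"
  awk '$1 !~ /^#/ && $2 == "active" {print $1}' "${inventory}"
}

# Only list servers when an inventory file is actually available
inventory_file="${SERVER_INVENTORY:-./servers.txt}"
if [ -f "${inventory_file}" ]; then
  list_active_servers "${inventory_file}"
fi
```

Whatever the source of truth is, the only contract with the `Capfile` is one hostname per line on STDOUT.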

Hope this helps!
]]></content:encoded>
      <dc:date>2013-07-15T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Prefix Streaming stdout & stderr in Go</title>
      <link>https://kvz.io/prefix-streaming-stdout-and-stderr-in-golang.html</link>
      <description><![CDATA[If you are writing code in Go and are executing a lot of (remote) commands,
you may want to indent all of their
output, prefix the loglines with hostnames, or mark anything that was thrown to stderr
red, so you can spot errors more easily.
]]></description>
      <pubDate>Fri, 12 Jul 2013 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/prefix-streaming-stdout-and-stderr-in-golang.html</guid>
      <content:encoded><![CDATA[If you are writing code in Go and are executing a lot of (remote) commands,
you may want to indent all of their
output, prefix the loglines with hostnames, or mark anything that was thrown to `stderr`
red, so you can spot errors more easily.

For this purpose I wrote Logstreamer.

<!--more-->

You pass 3 arguments to `NewLogstreamer()`:

- Your `*log.Logger`
- Your desired prefix (the `"stdout"` and `"stderr"` prefixes have special meaning)
- Whether the lines should be recorded (`true` or `false`). This is useful if you want to retrieve any errors.

This returns an interface that you can point `exec.Command`'s `cmd.Stderr` and `cmd.Stdout` to.
All bytes that are written to it are split by newline and then prefixed to your specification.

## Test

```bash
$ cd src/pkg/logstreamer/
$ go test
```

Here I issue two local commands, `ls -al` and `ls nonexisting`:

![screen shot 2013-07-02 at 2 48 33 pm](/assets/images/posts/16177cf0-e316-11e2-8dc6-320f52f71442.png)

Over at [Transloadit](https://transloadit.com) we use it to prefix streaming remote command output.
Servers stream command output over SSH back to me, and every line is prefixed with a date, their hostname & marked red in case they
wrote to `stderr`. Makes it really easy to spot errors & their origin.

The project is [hosted on GitHub](https://github.com/kvz/logstreamer), but
here's a snippet if you want to dive right in.

```go
package logstreamer

import (
	"bytes"
	"io"
	"os"
	"log"
)

type Logstreamer struct {
	Logger    *log.Logger
	buf       *bytes.Buffer
	readLines string
	// If prefix == stdout, colors green
	// If prefix == stderr, colors red
	// Else, prefix is taken as-is, and prepended to anything
	// you throw at Write()
	prefix string
	// if true, saves output in memory
	record  bool
	persist string

	// Adds color to stdout & stderr if terminal supports it
	colorOkay  string
	colorFail  string
	colorReset string
}

func NewLogstreamer(logger *log.Logger, prefix string, record bool) *Logstreamer {
	streamer := &Logstreamer{
		Logger:     logger,
		buf:        bytes.NewBuffer([]byte("")),
		prefix:     prefix,
		record:     record,
		persist:    "",
		colorOkay:  "",
		colorFail:  "",
		colorReset: "",
	}

	if os.Getenv("TERM") == "xterm" {
		streamer.colorOkay  = "\x1b[32m"
		streamer.colorFail  = "\x1b[31m"
		streamer.colorReset = "\x1b[0m"
	}

	return streamer
}

func (l *Logstreamer) Write(p []byte) (n int, err error) {
	if n, err = l.buf.Write(p); err != nil {
		return
	}

	err = l.OutputLines()
	return
}

func (l *Logstreamer) Close() error {
	l.Flush()
	l.buf = bytes.NewBuffer([]byte(""))
	return nil
}

func (l *Logstreamer) Flush() error {
	// Size the slice to what is still buffered, a nil slice would read nothing
	p := make([]byte, l.buf.Len())
	if _, err := l.buf.Read(p); err != nil {
		return err
	}

	l.out(string(p))
	return nil
}

func (l *Logstreamer) OutputLines() (err error) {
	for {
		line, err := l.buf.ReadString('\n')
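		if err == io.EOF {
			// No newline yet: put the partial line back so it isn't lost.
			// It gets completed by a following Write()
			l.buf.WriteString(line)
			break
		}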
		if err != nil {
			return err
		}

		l.readLines += line
		l.out(line)
	}

	return nil
}

func (l *Logstreamer) ResetReadLines() {
	l.readLines = ""
}

func (l *Logstreamer) ReadLines() string {
	return l.readLines
}

func (l *Logstreamer) FlushRecord() string {
	buffer := l.persist
	l.persist = ""
	return buffer
}

func (l *Logstreamer) out(str string) (err error) {
	if l.record == true {
		l.persist = l.persist + str
	}

	if l.prefix == "stdout" {
		str = l.colorOkay + l.prefix + l.colorReset + " " + str
	} else if l.prefix == "stderr" {
		str = l.colorFail + l.prefix + l.colorReset + " " + str
	} else {
		str = l.prefix + str
	}

	l.Logger.Print(str)

	return nil
}
```

```go
package logstreamer

import (
	"log"
	"os"
	"os/exec"
	"testing"
	"fmt"
)

func TestLogstreamerOk(t *testing.T) {
	// Create a logger (your app probably already has one)
	logger := log.New(os.Stdout, "--> ", log.Ldate|log.Ltime)

	// Setup a streamer that we'll pipe cmd.Stdout to
	logStreamerOut := NewLogstreamer(logger, "stdout", false)
	// Setup a streamer that we'll pipe cmd.Stderr to.
	// We want to record/buffer anything that's written to this (3rd argument true)
	logStreamerErr := NewLogstreamer(logger, "stderr", true)

	// Execute something that succeeds
	cmd := exec.Command(
		"ls",
		"-al",
	)
	cmd.Stderr = logStreamerErr
	cmd.Stdout = logStreamerOut

	// Reset any error we recorded
	logStreamerErr.FlushRecord()

	// Execute command
	err := cmd.Start()

	// Failed to spawn?
	if err != nil {
		t.Fatal("ERROR could not spawn command.", err.Error())
	}

	// Failed to execute?
	err = cmd.Wait()
	if err != nil {
		t.Fatal("ERROR command finished with error. ", err.Error(), logStreamerErr.FlushRecord())
	}
}

func TestLogstreamerErr(t *testing.T) {
	// Create a logger (your app probably already has one)
	logger := log.New(os.Stdout, "--> ", log.Ldate|log.Ltime)

	// Setup a streamer that we'll pipe cmd.Stdout to
	logStreamerOut := NewLogstreamer(logger, "stdout", false)
	// Setup a streamer that we'll pipe cmd.Stderr to.
	// We want to record/buffer anything that's written to this (3rd argument true)
	logStreamerErr := NewLogstreamer(logger, "stderr", true)

	// Execute something that fails
	cmd := exec.Command(
		"ls",
		"nonexisting",
	)
	cmd.Stderr = logStreamerErr
	cmd.Stdout = logStreamerOut

	// Reset any error we recorded
	logStreamerErr.FlushRecord()

	// Execute command
	err := cmd.Start()

	// Failed to spawn?
	if err != nil {
		t.Fatal("ERROR could not spawn command.", err.Error())
	}

	// Failed to execute?
	err = cmd.Wait()
	if err != nil {
		fmt.Printf("Good. command finished with %s. %s. \n", err.Error(), logStreamerErr.FlushRecord())
	} else {
		t.Fatal("This command should have failed")
	}
}
```
]]></content:encoded>
      <dc:date>2013-07-12T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Loosely Typed Code Deserves Triple Equality</title>
      <link>https://kvz.io/change-your-codebase-to-use-triple-equality.html</link>
      <description><![CDATA[In loosely typed languages such as JavaScript or PHP, using ==
to compare values is bad practice because it doesn't
account for type, hence false == 0 == '' == null == undefined, etc.
And you may accidentally match more than you bargained for.
]]></description>
      <pubDate>Tue, 23 Apr 2013 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/change-your-codebase-to-use-triple-equality.html</guid>
      <content:encoded><![CDATA[In loosely typed languages such as JavaScript or PHP, using `==`
to compare values is bad practice because it doesn't
account for type, hence `false == 0 == '' == null == undefined`, etc.
And you may accidentally match more than you bargained for.

If you want to limit the unintended effects & bugs this may lead to,
it's often wise to use `===`.

In the process of converting legacy
codebases to use these triple equality operators, I find that as a rule
of thumb you can almost **always force triple equality** in case of
comparing variables against **non-numerical strings**.

There's just never a case where you want the text `'Kevin'`
to pass for the boolean `true`, or the number `3`.
And if that can still happen in your legacy codebase,
you'll want to limit those risks sooner rather than later, even if that
breaks things that now accidentally work.

<!--more-->

Switching legacy code that compares against numerical strings, numbers, or
booleans is trickier.
For instance, user input often arrives as a string, so
bluntly changing `age == 4` to `age === 4` will no longer match the user input `age`,
and will introduce more problems than it solves. Addressing these cases needs more thought & care.

To change all of the text-compare cases, here's a regex that can, to a degree, automate
this otherwise much more painful process.

```bash
([0-9a-zA-Z\_]\s+[!=])=(\s+['"][^0-9\-\.])
$1==$2
```

You could use this in programs like Vim or [Araxis](https://www.araxis.com/replace-in-files/index-eur.html),
or escape it and then use it in `sed`. Make sure the changes are under source control and reviewed & tested
before they are committed & pushed.
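
For `sed`, an escaped version could look like the sketch below. Treat it as a starting point, not a battle-tested tool: `\s` is swapped for `[[:space:]]` so it also works with BSD `sed`, and the function name `to_triple_equality` is made up for this example.

```bash
#!/usr/bin/env bash
# Sketch: the regex from above, escaped for `sed -E`.
# to_triple_equality is a hypothetical helper, not an existing tool.

to_triple_equality() {
  # Group 1: end of an identifier, whitespace, and the first `=` or `!`
  # Group 2: whitespace, an opening quote, and a non-numeric first character
  sed -E "s/([0-9a-zA-Z_][[:space:]]+[!=])=([[:space:]]+['\"][^0-9.-])/\1==\2/g"
}

echo "if (name == 'Kevin') {" | to_triple_equality
# → if (name === 'Kevin') {

echo "if (age == '4') {" | to_triple_equality
# → if (age == '4') {   (numerical string, deliberately left alone)
```

Piping single lines through it first is a cheap way to see what it would touch before letting it loose on real files.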

I tested this on legacy JavaScript and PHP projects. Here you can see the potential
[issues caught in php.js](https://github.com/kvz/phpjs/commit/d60549d5ec65f1ca63acb33a534616d58f47a4c4)

Improvements welcome.
]]></content:encoded>
      <dc:date>2013-04-23T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Scrape All Text From a Domain</title>
      <link>https://kvz.io/obtain-all-text-from-your-website.html</link>
      <description><![CDATA[Here are some commands to download the most important pages of your
site as plain text (determined by MAX_DEPTH), and save it into one
big &lt;DOMAIN&gt;.txt file.
]]></description>
      <pubDate>Fri, 19 Apr 2013 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/obtain-all-text-from-your-website.html</guid>
      <content:encoded><![CDATA[Here are some commands to download the most important pages of your
site as plain text (determined by `MAX_DEPTH`), and save it into one
big `<DOMAIN>.txt` file.

This could come in handy when you want to have everything checked for
grammar & spelling errors.

After the spellcheck you'd still have to search through your
codebase / database to find & fix the culprits, but this should already save
you some time in discovery.

<!--more-->

```bash
#!/usr/bin/env bash
#
# Downloads a site's text to 1 text file, so you can easily
# have it grammar/spellchecked
#
# Requires: wget, html2text
# Recommended: pandoc vs html2text
# Improve at: https://kvz.io/blog/2013/04/19/obtain-all-text-from-your-website/
#

# Exit on errors. Passing -e via the shebang is not portable, because
# Linux hands "bash -e" to env as a single argument
set -e

[ -z "${DOMAIN}" ]    && echo "Cannot continue without DOMAIN. " && exit 1
[ -z "${EXCLUDE}" ]   && EXCLUDE="*.css,*.js,*.rss,*.xml,*.png,*.jpg,*.jpeg,*.gif,*.flv,*.swf,*.mp4,*.mov,*.mp3,*.wav"
[ -z "${MAX_DEPTH}" ] && MAX_DEPTH="1"
[ -z "${OUTPUT}" ]    && OUTPUT="./${DOMAIN}.txt"
[ -z "${TMPDIR}" ]    && TMPDIR="/tmp"
[ -z "${TXTENGINE}" ] && [ -x "$(which pandoc)" ]    && TXTENGINE="pandoc +RTS -K16m -RTS -t markdown -o- -f html -i"
[ -z "${TXTENGINE}" ] && [ -x "$(which html2text)" ] && TXTENGINE="html2text -nobs"
[ -z "${TXTENGINE}" ] && echo "Cannot continue without pandoc or html2text. " && exit 1

wget \
  --adjust-extension \
  --convert-links \
  --directory-prefix="${TMPDIR}" \
  --level="${MAX_DEPTH}" \
  --no-parent \
  --recursive \
  --reject="${EXCLUDE}" \
  --restrict-file-names=windows,lowercase \
"https://${DOMAIN}"

[ -f "${OUTPUT}" ] && rm -f "${OUTPUT}"
find "${TMPDIR}/${DOMAIN}" -type f -name '*.html' -print0 \
  |while IFS= read -r -d '' file; do
  echo "imported by ${0} from: $(echo "${file}" |sed "s@^${TMPDIR}/${DOMAIN}@@")" >> "${OUTPUT}"
  echo "==================================" >> "${OUTPUT}"
  ${TXTENGINE} "${file}" >> "${OUTPUT}"
  echo -e "\n\n\n\n" >> "${OUTPUT}"
done

echo ""
echo " Combined text file ready in ${OUTPUT}"
echo " To cleanup after this script, type: rm -rf \"${TMPDIR}/${DOMAIN}\""
echo ""
```

Required:

```bash
$ brew install wget html2text # or apt-get install wget html2text
```

Run it:

```bash
$ DOMAIN=kvz.io ./obtain_site_text.sh
```

Recommended:

If you can install [Pandoc](https://johnmacfarlane.net/pandoc/README.html),
the resulting text output will be in Markdown and of much higher quality.

Improvements are more than welcome!
]]></content:encoded>
      <dc:date>2013-04-19T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Migrate Redis Keys Without RDB Files</title>
      <link>https://kvz.io/migrate-redis-data-without-filesystem-access.html</link>
      <description><![CDATA[Recently we moved the Transloadit status page
from an unmanaged EC2 instance to the Nodejitsu platform.
We kept status uptime history in redis, and obviously I wanted to preserve that
data.
]]></description>
      <pubDate>Tue, 16 Apr 2013 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/migrate-redis-data-without-filesystem-access.html</guid>
      <content:encoded><![CDATA[Recently we moved the [Transloadit status page](https://status.transloadit.com)
from an unmanaged EC2 instance to the Nodejitsu platform.
We kept status uptime history in redis, and obviously I wanted to preserve that
data.

For the new setup I did not have access to the filesystem; I only had a redis
port to talk to. So instead of rsyncing the `.rdb` file I used Redis replication
to migrate the data between instances.

<!--more-->

Here's the redacted terminal transcript. The sending host is
old-redis.transloadit.com, the receiving host is new-redis.transloadit.com.

```bash
$ # First let's connect to the new host
$ redis-cli -h new-redis.transloadit.com
redis new-redis.transloadit.com:6379> AUTH ...:...
OK

# Let's start with a clean slate, first make sure it stands alone
redis new-redis.transloadit.com:6379> SLAVEOF NO ONE
OK

# WARNING! This will trash all existing data on the receiving host
redis new-redis.transloadit.com:6379> FLUSHALL
OK

# Now setup replication so that redis copies all keys from old to new
redis new-redis.transloadit.com:6379> SLAVEOF old-redis.transloadit.com 6379
OK

# Wait until INFO shows `role:slave`, and the
# same number of keys as on the old server.
redis new-redis.transloadit.com:6379> INFO
...
role:slave
...
OK

# Check some random keys to see if you have all the data.
redis new-redis.transloadit.com:6379> LRANGE history 0 1
 1) "{\"message\":\"All services operational\",\"date\":\"2013-04-15\"}"
OK

# Looks good, now let's disable slave mode, make it stand alone again
redis new-redis.transloadit.com:6379> SLAVEOF NO ONE
OK
```

And that's it, you've copied all the redis keys without using `.rdb` files.
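
If you'd rather not eyeball `INFO` for the key counts, a small helper could compare them before you run the final `SLAVEOF NO ONE`. This is only a sketch: the function names are made up, it uses `DBSIZE` (which counts keys in the currently selected database only), and it assumes `redis-cli` can reach both hosts (add `-a <password>` if yours require `AUTH`).

```bash
#!/usr/bin/env bash
# Sketch: compare key counts between the old and the new Redis host.
# key_count and same_key_count are hypothetical helper names.

key_count() {
  # DBSIZE returns the number of keys in the currently selected database
  redis-cli -h "${1}" -p "${2:-6379}" DBSIZE
}

same_key_count() {
  local old new
  old="$(key_count "${1}")"
  new="$(key_count "${2}")"
  echo "old=${old} new=${new}"
  [ "${old}" = "${new}" ]
}

# Usage:
#   same_key_count old-redis.transloadit.com new-redis.transloadit.com \
#     && echo "Counts match"
```

Keep in mind that the counts can differ for a moment if the old host is still receiving writes.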

Hope this helps, let me know if you have suggestions.
]]></content:encoded>
      <dc:date>2013-04-16T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Let's Make DNS Outage Suck Less</title>
      <link>https://kvz.io/poormans-way-to-decent-dns-failover.html</link>
      <description><![CDATA[Unfortunately the Linux DNS resolver has no direct support for detecting and doing failovers for DNS servers. It keeps feeding requests to your primary resolving nameserver, waits for a configured timeout, attempts again, and only then tries the second nameserver.
]]></description>
      <pubDate>Sun, 24 Mar 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/poormans-way-to-decent-dns-failover.html</guid>
      <content:encoded><![CDATA[Unfortunately the Linux DNS resolver has no direct support for detecting and doing failovers for DNS servers. It keeps feeding requests to your primary resolving nameserver, waits for a configured timeout, attempts again, and only then tries the second nameserver.

This typically means a delay of nearly 30s for every request as long as your primary nameserver is unreachable. The resolver doesn't learn to target your secondary nameserver directly for as long as the trouble lasts.

Even with the most optimal configuration, the delay will still be measured in seconds per request. For many requests, that's many more seconds.

I wanted to solve this.

<!--more-->

Mainly because over at [Transloadit](https://transloadit.com) our Amazon EC2 resolving nameserver (`172.16.0.23`) is unreachable too often.
When that happens it causes big delays and queues in some processes, and even downtime, as we rely on domain->ip translation. For instance, a customer could tell us to download 1000 images from different URLs, watermark them, and upload them to an sFTP server.

I wanted solid failover to Google / Level3's nameservers in case Amazon's went down again. And then failback as soon as possible, because Amazon can resolve `server33.transloadit.com` hostnames to LAN IPs where applicable, resulting in lower latency for instance-to-instance communication, when encoding machines need to work together.

<!-- I know that Spotify actually uses [DNS as a distributed datastore](https://labs.spotify.com/2013/02/25/in-praise-of-boring-technology/) for service discovery & configuration management.
 -->

But whatever the use case, there's a need for a better way to fail over.

Ideally one that does not involve more local proxy daemons, external services, keepalived VRRP IPs, etc., as that would just introduce more complexity and Single Points of Failure.
It should be transparent, archaic and at most rely on `crontab`.

So I wrote [nsfailover](https://github.com/kvz/nsfailover) in bash, and have it
replace the resolve-configuration when needed. It's rugged, easy to debug, hard to break, and has been working really well for us so far.

Running it is pretty simple, too. Configuration such as `NS_1`
is done via environment variables, here's an example where I set
it globally using export:

```bash
$ export NS_1=172.16.0.23
$ sudo nsfailover.sh
2013-03-27 14:18:22 UTC [     info] Best nameserver is primary (172.16.0.23)
2013-03-27 14:18:22 UTC [     info] No need to change /etc/resolv.conf
```

Or maybe you also want to define the backup nameserver (defaults
to Google's), and just pass the config to this process:

```bash
$ NS_1=8.8.8.8 NS_2=8.8.4.4 sudo -E nsfailover.sh
2013-03-27 15:01:53 UTC [     info] Best nameserver is primary (8.8.8.8)

 # Written by /srv/current/stack/bin/nsfailover.sh @ 20130327150153
 nameserver 8.8.8.8
 options timeout:3 attempts:1
 search compute-1.internal

2013-03-27 15:01:53 UTC [emergency] I changed /etc/resolv.conf to primary (8.8.8.8)
```

Tight!

Now if you save this in [crontab with a timeout](/blog/2012/12/31/lock-your-cronjobs/):

```bash
$ crontab -e
# By default, NS_2 is Google, NS_3 is Level3, so only your NS_1 is required:
* * * * * NS_1=172.16.0.23 timeout -s9 50s nsfailover.sh 2>&1 |logger -t cron-nsfailover
```

it turns out, Bob indeed is your uncle :)

The `logger` pipe will make all output go to syslog, so to get notified when a
failover happens, just scan for `[emergency]` in the `cron-nsfailover` tag.
In our case, Papertrail receives our syslog and I made it report to Campfire when this happens.
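
For a feel of the core idea, here's a minimal sketch of "use the first nameserver that answers". It is not the nsfailover source: `pick_nameserver` and `probe_with_dig` are names made up for this example, `example.com` is a placeholder hostname, and probing with `dig` is an assumption.

```bash
#!/usr/bin/env bash
# Sketch of the idea only; this is NOT the nsfailover source.

# Hypothetical probe: succeeds if the nameserver answers within 3 seconds.
# example.com is a placeholder for a hostname you care about.
probe_with_dig() {
  dig +time=3 +tries=1 +short "@${1}" example.com > /dev/null 2>&1
}

# Prints the first nameserver, in order of preference, that passes the probe
pick_nameserver() {
  local probe="${1}"
  shift
  local ns
  for ns in "${@}"; do
    if "${probe}" "${ns}"; then
      echo "${ns}"
      return 0
    fi
  done
  return 1
}

# Usage:
#   pick_nameserver probe_with_dig 172.16.0.23 8.8.8.8 8.8.4.4
```

Because the primary is always probed first, failback comes for free: as soon as it answers again, the next run picks it over the backups.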

There's more documentation [available on Github](https://github.com/kvz/nsfailover).
Let me know if you have any improvements or send me a pull request :)
]]></content:encoded>
      <dc:date>2013-03-24T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Find Duplicate Input With MySQL</title>
      <link>https://kvz.io/find-duplicate-input-with-mysql.html</link>
      <description><![CDATA[Back to basic, let's brush up on some SQL :)
]]></description>
      <pubDate>Mon, 04 Mar 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/find-duplicate-input-with-mysql.html</guid>
      <content:encoded><![CDATA[Back to basic, let's brush up on some `SQL` :)

At my company we have employees creating customer accounts every day.
Sometimes we make mistakes, for instance, we forget to check if the
company already was a customer (they may have had a product 10y ago).

Duplicate accounts can cause all sorts of problems, so I wanted
a way to detect them with `SQL`.

The problem was, the company names may have been entered with
different punctuation, whitespace,
etc. So I needed similar names to surface from the depths of our database,
not just exact matches (that would have been too easy :)

<!--more-->

For the solution I turned to [SOUNDEX](https://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_soundex)
to fuzzy match similar sounding company names. I then
review the results myself (false positives are possible, but since they are few,
it's a simple task to double-check) and report back to our company.

I thought I'd share

- partly because it could be useful to others
  (obviously this could be used to detect all kinds of user generated typos and similar
  entries);
- mostly because I'm curious to find if there is a better (more performant)
  way to write this query.

Do you know how? Leave a comment :)

```sql
-- Select all the individual company names that have a
-- soundex_code that occurs more than once (I now use a subquery for that)
SELECT
  `id`,
  `customer_name`,
  SOUNDEX(`customer_name`) AS soundex_code
FROM `customers`
WHERE SOUNDEX(`customer_name`) IN (
  -- Subquery: select all soundex_codes that occur more than once,
  -- (this does not return the individual company names that share them)
  SELECT SOUNDEX(`customer_name`) AS soundex_code
  FROM `customers`
  WHERE 1 = 1
    AND `is_active` = 1
    -- More specific criteria to define who you want to compare
  GROUP BY soundex_code
  HAVING COUNT(*) > 1
)
ORDER BY
  soundex_code,
  `customer_name`
```

This e.g. returns:

```bash
`id` `customer_name`  `soundex_code`
 291  F.S. Hosting     F2352
 1509 FS hosting       F2352
 9331 R  Schmit        R253
 9332 R Schmit         R253
```

By the way: The SQL is formatted according to my old [SQL Formatting blogpost](https://kvz.io/blog/2009/03/04/sql-formatting/).
Exactly 4 years after I published it, I still find it a pretty useful code convention.
]]></content:encoded>
      <dc:date>2013-03-04T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Introducing BASH3 Boilerplate</title>
      <link>https://kvz.io/introducing-bash3boilerplate.html</link>
      <description><![CDATA[
  This project now has its own homepage at bash3boilerplate.sh.

]]></description>
      <pubDate>Tue, 26 Feb 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/introducing-bash3boilerplate.html</guid>
      <content:encoded><![CDATA[> This project now has its own homepage at [bash3boilerplate.sh](https://bash3boilerplate.sh).

When hacking up BASH scripts, I often find there are some
higher level things like logging, configuration, command-line argument
parsing that:

 - I need every time
 - Take quite some effort to get right
 - Keep you from your actual work

Here's an attempt to bundle those things in a generalized way so that
they are reusable as-is in most of my (and hopefully your, if not ping
me) programs.

## Goals

Delete-key-friendly. I propose using [`main.sh`](https://github.com/kvz/bash3boilerplate/blob/master/main.sh) as a base and removing the
parts you don't need, rather than introducing a ton of packages, includes, compilers, etc.

Aiming for portability, I'm targeting Bash 3 (OSX still ships
with 3 for instance). If you're going to ask people to install
Bash 4 first, you might as well pick a more advanced language as a
dependency.

We're automatically testing bash3boilerplate and it's proven to work on:

- [Linux](https://travis-ci.org/kvz/bash3boilerplate/jobs/109804166#L91) `GNU bash, version 4.2.25(1)-release (x86_64-pc-linux-gnu)`
- [OSX](https://travis-ci.org/kvz/bash3boilerplate/jobs/109804167#L2453) `GNU bash, version 3.2.51(1)-release (x86_64-apple-darwin13)`

## Features

- Structure
- Safe defaults (break on error, pipefail, etc)
- Configuration by environment variables
- Configuration by command-line arguments (definitions parsed from help info,
  so no duplication needed)
- Magic variables like `__file` and `__dir`
- Logging that supports colors and is compatible with [Syslog Severity levels](https://en.wikipedia.org/wiki/Syslog#Severity_levels)
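
To give an idea of what the safe defaults and magic variables boil down to, here's a minimal sketch. It's an approximation for illustration (the `:-${0}` fallback is my own addition), so check `main.sh` in the repository for what b3bp actually ships.

```bash
#!/usr/bin/env bash
# Sketch of the "safe defaults" and "magic variables" ideas.
# See main.sh in the repository for the real thing.

set -o errexit   # Exit as soon as a command fails
set -o nounset   # Treat the use of unset variables as an error
set -o pipefail  # A pipeline fails if any command in it fails

# Absolute directory and full path of the current script
__dir="$(cd "$(dirname "${BASH_SOURCE[0]:-${0}}")" && pwd)"
__file="${__dir}/$(basename "${BASH_SOURCE[0]:-${0}}")"

echo "Running ${__file} from ${__dir}"
```

With these in place a script can refer to files next to itself via `${__dir}`, no matter which directory it was started from.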

## Installation

There are 3 ways you can install (parts of) b3bp:

1. Just get the main template: `wget https://raw.githubusercontent.com/kvz/bash3boilerplate/master/main.sh`
2. Clone the entire project: `git clone git@github.com:kvz/bash3boilerplate.git`
3. As of `v1.0.3`, b3bp can be installed as a `package.json` dependency via: `npm install --save bash3boilerplate`

Although *3* introduces a node.js dependency, this does allow for easy version pinning and distribution in environments that already have this prerequisite. But nothing prevents you from just using `curl` and keeping your project or build system low on external dependencies.

## Best practices

As of `v1.0.3`, b3bp adds some nice re-usable libraries in `./src`. Later on we'll be using snippets inside this directory to build custom packages. In order to make the snippets in `./src` more useful, we recommend these guidelines.

### Library exports

It's nice to have a bash package that can be used in the terminal and also be invoked as a command line function. To achieve this the exporting of your functionality *should* follow this pattern:

```bash
if [ "${BASH_SOURCE[0]}" != "${0}" ]; then
  export -f my_script
else
  my_script "${@}"
  exit $?
fi
```

This allows a user to `source` your script or invoke it as a script.

```bash
# Running as a script
$ ./my_script.sh some args --blah
# Sourcing the script
$ source my_script.sh
$ my_script some more args --blah
```

(taken from the [bpkg](https://raw.githubusercontent.com/bpkg/bpkg/master/README.md) project)

### Miscellaneous

- In functions, use `local` before every variable declaration
- This project settles on two spaces for tabs
]]></content:encoded>
      <dc:date>2013-02-26T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>OSX Productivity: Dropbox Your Screenshots</title>
      <link>https://kvz.io/osx-productivity-dropbox-your-screenshots.html</link>
      <description><![CDATA[I often share screens with co-workers by Campfire, Github, or mail.
Visualizing something can save you a lot of typing. Show people
what button shade doesn't look quite right, instead of explaining in 1000+ characters.
Share a load graph without saving &amp; attaching images, or handing out basic auth credentials.
The list goes on &amp; on. Once you make it a joy to share, you'll find use-cases on
a daily basis, and it is my belief you'll lose less time on typing and miscommunication.
]]></description>
      <pubDate>Mon, 18 Feb 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/osx-productivity-dropbox-your-screenshots.html</guid>
      <content:encoded><![CDATA[I often share screens with co-workers by Campfire, Github, or mail.
Visualizing something can save you a lot of typing. Show people
what button shade doesn't look quite right, instead of explaining in 1000+ characters.
Share a load graph without saving & attaching images, or handing out basic auth credentials.
The list goes on & on. Once you make it a joy to share, you'll find use-cases on
a daily basis, and it is my belief you'll lose less time on typing and miscommunication.

At least.. that's what I'm experiencing, using the following tricks.

<!--More-->

## The Normal Flow

- Press SHIFT + COMMAND + 4
- (optionally you press SPACE to switch to Window-capturing)
- Drag an area that you want to share with co-workers
- A screenshot is saved to `~/Desktop`

And now you'll have to find your way to the screenshot. That seems like effort.

## The Better Flow

But with many windows open, it's a pain to get to that screenshot.

One thing you could do is make the Desktop folder more accessible by
adding it to your Dock. You can make it even better by setting it up as a
**Fan**, ordered by **Date Modified**.

Now, your latest screenshot will always be the bottom-first item from your
new Dock fan. Now you can drag & drop it right into your open Github/Campfire
app like a pro, without ever minimizing the apps you're already working in,
or firing up Finder windows to hunt down your screen.

## The Ultimate Flow

To add even more sugar to this, you could tell OSX to save screenshots into Dropbox
by default. This way you can use screens as quick notes, distributed across
all of your machines; or you could tell Dropbox to quickly make a public
link of your screen, and share a URL with anyone.

## Setup

Here's how to set up the Ultimate flow, assuming you have [Dropbox](https://db.tt/gkDsF6u).
Without Dropbox, it's still cool, just remove `/Dropbox` from the following commands.

```bash
$ mkdir -p ~/Dropbox/Screenshots/
$ defaults write com.apple.screencapture location ~/Dropbox/Screenshots/
$ killall SystemUIServer
```

Now open your Dropbox in the GUI

```bash
$ open ~/Dropbox/
```

- Drag the `Screenshots` next to your Downloads & Trash icons in your Dock
- Right click: Fan
- Right click: Date Modified

Tight! Now make some screenshots as explained in the **Normal flow** and
see them automatically added to the fan, most recent always the first
clickable item, and Dropbox syncing them up instantly.

## Undo

Don't like it? Wow you are one tough customer. But here we go :)

```bash
$ defaults write com.apple.screencapture location ~/Desktop/
$ killall SystemUIServer
```

And just remove the folder from your Dock. That's all :)

## Alternatives

Now the article already mentions Dropbox is very optional, but of course
you can also substitute Dropbox with (free) competitors such as [Sparkleshare](https://twitter.com/predominant/status/303538082773340160) if you're into that.

[It was also mentioned](https://twitter.com/predominant/status/303538211966300162)
by Graham Weldon that you can use
CloudApp for a similar workflow.
It's free & available in the [Mac App Store](https://itunes.apple.com/us/app/cloud/id417602904?mt=12&ls=1).
]]></content:encoded>
      <dc:date>2013-02-18T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Too Many Authentication Failures for Root</title>
      <link>https://kvz.io/too-many-authentication-failures-for-root.html</link>
      <description><![CDATA[I recently had an annoying encounter with the error message:
Too many authentication failures for root.
I found out this can be caused because you've hoarded too many SSH keys :)
]]></description>
      <pubDate>Wed, 13 Feb 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/too-many-authentication-failures-for-root.html</guid>
      <content:encoded><![CDATA[I recently had an annoying encounter with the error message:
`Too many authentication failures for root`.
I found out this can be caused because you've hoarded too many SSH keys :)

So serves me right, but let's see what happens exactly.

<!--more-->

Consider this paste:

```bash
$ ssh -vvv root@67.23.163.74 2>&1 |egrep '(public|fail)'
debug3: Could not load "/Users/kevin/.ssh/id_rsa" as a RSA1 public key
debug3: Could not load "/Users/kevin/.ssh/id_dsa" as a RSA1 public key
debug1: Authentications that can continue: publickey,password
debug3: start over, passed a different list publickey,password
debug3: preferred publickey,keyboard-interactive,password
debug3: authmethod_lookup publickey
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering DSA public key: /Users/kevin/.ssh/id_dsa
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,password
debug1: Offering RSA public key: /Users/kevin/.ssh/kevin_201011_true
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,password
debug1: Offering RSA public key: /Users/kevin/.ssh/transkey.pem
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,password
debug1: Offering RSA public key: /Users/kevin/.ssh/kevin_201211_true
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,password
debug1: Offering RSA public key: /Users/kevin/.ssh/kevin_201205_true
debug2: we sent a publickey packet, wait for reply
debug1: Authentications that can continue: publickey,password
debug1: Offering RSA public key: /Users/kevin/.ssh/id_rsa
debug2: we sent a publickey packet, wait for reply
Received disconnect from 67.23.163.74: 2: Too many authentication failures for root
```

It tries 6 different SSH keys it found lying around on my system before asking for a password.
I knew SSH automagically looks for the best way to authenticate you.
No problem there.

But the serverside counts each try as a failure!
Apparently SSHD was configured to only allow so many (6) failures for a user (root),
hence SSH never even bothered to ask for the password that should be working for this box.

This server was freshly delivered by a colo provider, so I really could not get in any other way.
To fix, I could have cleaned up my keys, but instead I disabled login by Public key authentication:

```bash
$ ssh -o PubkeyAuthentication=no root@67.23.163.74
```

And this fixed the issue.

Afterwards I could set up proper public keys and disable password login altogether.
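
Once keys are in place, a way to keep a crowded `~/.ssh` from burning through the allowed attempts is to offer exactly one key. Here's a small sketch; the wrapper name `ssh_single_key` and the key path are made up for this example, while `IdentitiesOnly` and `-i` are regular OpenSSH client options.

```bash
#!/usr/bin/env bash
# Sketch: offer only one specific key to the server.
# ssh_single_key is a hypothetical wrapper name.

ssh_single_key() {
  local key="${1}"
  shift
  # IdentitiesOnly=yes: only use explicitly configured identities
  # (here the one passed with -i), even if ssh-agent offers more
  ssh -o IdentitiesOnly=yes -i "${key}" "${@}"
}

# Usage (the key path is an example):
#   ssh_single_key ~/.ssh/id_rsa root@67.23.163.74
```

That way the server only sees a single public key attempt, no matter how many keys you've hoarded.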

Tricky though, so I thought I'd share :)
]]></content:encoded>
      <dc:date>2013-02-13T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Keep Mounted Network Drives Alive on OSX</title>
      <link>https://kvz.io/macosx-persistent-mounted-network-drives.html</link>
      <description><![CDATA[I love my NAS but because I tried to save a little money it does not run SABnzbd very well.
]]></description>
      <pubDate>Mon, 11 Feb 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/macosx-persistent-mounted-network-drives.html</guid>
      <content:encoded><![CDATA[I love my NAS but because I tried to save a little money it does not [run SABnzbd](/blog/2011/02/28/optimize-your-synology-for-downloading/) very well.

I've tried different approaches but find myself ending up downloading on OSX as it writes to a network share on my NAS. Too bad, but I'm archiving this one under the section first world problems.

The challenge I have now though, is when my Mac goes to sleep, my mounts disappear, and SABnzbd writes to the local filesystem instead. Cause as far as my downloading program could tell, it was already writing to a local filesystem, so it will just keep on doing that until my Mac's disk is at 100%.

I wrote a little script to prevent that.

You may not be running SABnzbd, but there are obviously many other
use cases where you want a network mount to persist.
Especially if you are automating something outside of the GUI.

With some small adjustments this could work for Linux/NFS/SMB as well.

<!--more-->

Here we go

```bash
#!/usr/bin/env bash -e
# Monitors mounts, checks if they're writable, remounts if necessary.
# For OSX / AFP.
# If it had to remount, exits with code 1 so you can easily chain
# other scripts in such an event. e.g.:
#
#  NASPASS=******** ./remounter.sh || ./restart_downloader.sh restart
#

function _log () {
  local level="${1}"
  local str="${2}"
  echo "[$(date "+%Y-%m-%d %H:%M:%S")] ${level}: ${str}"
}
function info () {
  _log "INFO" "${1}"
}
function err  () {
  _log " ERR"  "${1}"
}
function crit () {
  _log "CRIT" "${1}";

  if [ "${REBOOTONCRIT}" = 1 ]; then
    echo "Warning, CRIT happened and reboot on crit was specified. "
    echo "Rebooting in 50 seconds"
    shutdown -r +50
  fi

  exit 1
}

[ -n "${NASSHARES}" ]    || NASSHARES="downloads video"
[ -n "${NASHOST}" ]      || NASHOST="nas.local"
[ -n "${NASUSER}" ]      || NASUSER="${USER}"
[ -n "${STAMPFILE}" ]    || STAMPFILE=".remounter.stamp"
[ -n "${NASPASS}" ]      || crit "Please set the shares password like so: NASPASS=******** ${0}"
[ -n "${FORCEREMOUNT}" ] || FORCEREMOUNT=0
[ -n "${REBOOTONCRIT}" ] || REBOOTONCRIT=0

function mount_exists () {
  local share="${1}"
  local dir="/Volumes/${share}"

  echo "$(mount |egrep -i "^//${NASUSER}@${NASHOST}/${share} on ${dir} " |wc -l)"
}

function mount_writable () {
  local share="${1}"
  local dir="/Volumes/${share}"

  local stamp="$(date "+%Y-%m-%d %H:%M:%S")"
  local stampfile="${dir}/${STAMPFILE}"

  echo "${stamp}" > "${stampfile}" || true

  local found="$(cat "${stampfile}")"

  [ -f "${stampfile}" ] && rm "${stampfile}"

  if [ "${found}" != "${stamp}" ]; then
    echo "0"
  else
    echo "1"
  fi
}

function remount () {
  local share="${1}"
  local dir="/Volumes/${share}"
  local stampfile="${dir}/${STAMPFILE}"

  info "remounting ${share}... "

  if [ -d "${dir}" ]; then
    # Try 3 times because this happened once:
    # //kevin@nas.local/downloads on /Volumes/downloads (afpfs, nodev, nosuid, mounted by kevin)
    # //kevin@nas.local/video on /Volumes/video (afpfs, nodev, nosuid, mounted by kevin)
    # //kevin@nas.local/downloads on /Volumes/downloads (afpfs, nodev, nosuid, mounted by kevin)
    # //kevin@nas.local/video on /Volumes/video (afpfs, nodev, nosuid, mounted by kevin)
    for i in `seq 1 3`; do
      umount -f "${dir}" || true
      sleep 1

      [ "$(mount_exists ${share})" -eq 0 ] && [ "$(mount_writable ${share})" -eq 0 ] && break
    done

    [ "$(mount_writable ${share})" -ne 0 ] && crit "Could not unmount ${share}. Still writable"
    [ "$(mount_exists ${share})" -ne 0 ] && crit "Could not unmount ${share}. Still exists"
  fi

  mkdir -p "${dir}"
  mount_afp -i "afp://${NASUSER}:${NASPASS}@${NASHOST}/${share}" "${dir}"
}

for share in ${NASSHARES}; do
  if [ "${FORCEREMOUNT}" = 1 ] || [ "$(mount_exists ${share})" -ne 1 ] || [ "$(mount_writable ${share})" -ne 1 ]; then
    [ -z "${missing}" ] || missing="${missing} "
    missing="${missing}${share}"
    remount "${share}"
  fi
done

if [ -z "${missing}" ]; then
  info "Mounts in tact. All good"
  exit 0
fi

info "Had to remount. "
exit 1
```
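
The write-then-read-back check is the part that carries over most easily to Linux, NFS, or SMB, since all it needs is a directory. Here is a minimal sketch of that check in isolation. The function name `dir_writable` and the stamp file name are made up for illustration and are not part of the script above.

```bash
#!/usr/bin/env bash
# Sketch: succeed (exit 0) only if a stamp can be written to a
# directory and read back unchanged. `dir_writable` is a made-up name.
dir_writable () {
  local dir="${1}"
  local stampfile="${dir}/.writetest.$$"   # $$ keeps parallel runs apart
  local stamp found

  stamp="$(date "+%Y-%m-%d %H:%M:%S") $$"

  # A missing dir or a read-only/stale mount makes the write fail
  { echo "${stamp}" > "${stampfile}"; } 2>/dev/null || return 1

  found="$(cat "${stampfile}" 2>/dev/null)" || found=""
  rm -f "${stampfile}"

  [ "${found}" = "${stamp}" ]
}
```

Something like `dir_writable /mnt/nas || echo "remount needed"` could then stand in for the `mount_writable` call, where `/mnt/nas` is a placeholder path.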

I [set up a cronjob](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/) so it
runs every minute like so:

```bash
$ crontab -e
* * * * * /location/remounter.sh || /location/restart_downloader.sh
```

To illustrate, here's `remounter.sh` in action. This happened while I was away.
Apparently my Mac went to sleep (otherwise you'd see a stamp every minute),
and that broke the mounts afterwards, multiple times and in different ways.

```bash
[2012-12-25 22:48:02] INFO: Mounts in tact. All good
[2012-12-25 22:49:01] INFO: Mounts in tact. All good
[2012-12-26 06:38:00] INFO: remounting downloads...
[2012-12-26 07:32:00] INFO: remounting downloads...
umount: /Volumes/downloads: not currently mounted
/blogscripts/remounter.sh: line 55: /Volumes/downloads/.remounter.stamp: Permission denied
cat: /Volumes/downloads/.remounter.stamp: No such file or directory
/blogscripts/remounter.sh: line 55: /Volumes/downloads/.remounter.stamp: Permission denied
cat: /Volumes/downloads/.remounter.stamp: No such file or directory
[2012-12-26 16:33:00] INFO: remounting downloads...
umount: /Volumes/downloads: not currently mounted
/blogscripts/remounter.sh: line 55: /Volumes/downloads/.remounter.stamp: Permission denied
cat: /Volumes/downloads/.remounter.stamp: No such file or directory
/blogscripts/remounter.sh: line 55: /Volumes/downloads/.remounter.stamp: Permission denied
cat: /Volumes/downloads/.remounter.stamp: No such file or directory
[2012-12-26 17:26:45] INFO: remounting video...
[2012-12-26 17:26:48] INFO: Had to remount.
[2012-12-26 17:27:04] INFO: Mounts in tact. All good
[2012-12-26 18:21:04] INFO: Mounts in tact. All good
mount_afp: AFPMountURL returned error -1069, errno is -1069
mount_afp: AFPMountURL returned error -5023, errno is -5023
[2012-12-27 01:45:02] INFO: Mounts in tact. All good
[2012-12-27 01:46:04] INFO: Mounts in tact. All good
[2012-12-27 01:47:06] INFO: Mounts in tact. All good
```

It would be a good idea to [solo](/blog/2012/12/31/lock-your-cronjobs) it. That locks
the job, so that if one run takes a long time, the next one can't spawn and overlap
it. Overlapping runs could lead to problems like high load or unexpected behavior.

Hope this helps :)
]]></content:encoded>
      <dc:date>2013-02-11T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Vagrant Tip: Sync VirtualBox Guest Additions</title>
      <link>https://kvz.io/vagrant-tip-keep-virtualbox-guest-additions-in-sync.html</link>
      <description><![CDATA[Quick tip. If you lose your Vagrant
mounts after kernel upgrades in your virtualbox,
you'll need to reinstall your VirtualBox Guest Additions.
Same is true when you upgrade Vagrant, etc.
]]></description>
      <pubDate>Wed, 16 Jan 2013 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/vagrant-tip-keep-virtualbox-guest-additions-in-sync.html</guid>
      <content:encoded><![CDATA[Quick tip. If you lose your [Vagrant](https://www.vagrantup.com/)
mounts after kernel upgrades in your virtualbox,
you'll need to reinstall your [VirtualBox Guest Additions](https://www.virtualbox.org/manual/ch04.html).
Same is true when you upgrade Vagrant, etc.

It's just a real pain and people usually avoid it by never upgrading.
Or delve in once they accidentally do.
But there's actually a nice & automated way of keeping your VM's guest additions in sync with
virtualbox.

<!--more-->

I was fed up with missing mountpoints and messages like these:

```bash
$ vagrant up
[default] Importing base box 'ubuntu-12.04-64bit'...
[default] The guest additions on this VM do not match the install version of
VirtualBox! This may cause things such as forwarded ports, shared
folders, and more to not work properly. If any of those things fail on
this machine, please update the guest additions and repackage the
box.

Guest Additions Version: 4.1.22
VirtualBox Version: 4.2.6
```

So, [Vbguest plugin](https://github.com/dotless-de/vagrant-vbguest) to the rescue!

To install, you type this one time in the directory that
also keeps your `Vagrantfile`:

```bash
$ # For vagrant < 1.1.5:
$ # vagrant gem install vagrant-vbguest

$ # For vagrant 1.1.5+ (thanks Lars Haugseth):
$ vagrant plugin install vagrant-vbguest
```

On successful execution, the output should look something like this:

```bash
Fetching: micromachine-1.0.4.gem (100%)
Fetching: vagrant-vbguest-0.6.4.gem (100%)
Successfully installed micromachine-1.0.4
Successfully installed vagrant-vbguest-0.6.4
2 gems installed
Installing ri documentation for micromachine-1.0.4...
Installing ri documentation for vagrant-vbguest-0.6.4...
Installing RDoc documentation for micromachine-1.0.4...
Installing RDoc documentation for vagrant-vbguest-0.6.4...
```
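
If you set up machines from a provisioning script, you may want the install step to be safe to repeat. Here is a minimal sketch, assuming Vagrant 1.1.5+ (which has the `plugin` subcommand) and assuming `vagrant plugin list` prints one `name (version)` line per plugin. The function name `ensure_vagrant_plugin` is made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: install a Vagrant plugin only when it isn't installed yet.
# `ensure_vagrant_plugin` is a hypothetical helper name.
ensure_vagrant_plugin () {
  local plugin="${1}"

  # Look for a line that starts with "<plugin> " in the plugin list
  if vagrant plugin list | grep -q "^${plugin} "; then
    echo "${plugin} already installed"
  else
    vagrant plugin install "${plugin}"
  fi
}
```

You'd call it as `ensure_vagrant_plugin vagrant-vbguest`.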

And that's it. From now on every `vagrant up` will check & install
the correct guest additions right after booting:

```bash
[default] Waiting for VM to boot. This can take a few minutes.
[default] VM booted and ready for use!
[default] GuestAdditions 4.2.6 running --- OK.
```

And if it's outdated, here's what happens:

```bash
$ vagrant up
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.
[default] VM booted and ready for use!
[default] GuestAdditions versions on your host (4.2.10) and guest (4.2.6) do not match.
stdin: is not a tty
Reading package lists...
Building dependency tree...
Reading state information...
linux-headers-3.2.0-23-generic is already the newest version.
dkms is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 126 not upgraded.
[default] Copy iso file /Applications/VirtualBox.app/Contents/MacOS/VBoxGuestAdditions.iso into the box /tmp/VBoxGuestAdditions.iso
stdin: is not a tty
mount: warning: /mnt seems to be mounted read-only.
[default] Installing Virtualbox Guest Additions 4.2.10 - guest version is 4.2.6
stdin: is not a tty
Verifying archive integrity... All good.
Uncompressing VirtualBox 4.2.10 Guest Additions for Linux..........
VirtualBox Guest Additions installer
Removing installed version 4.2.6 of VirtualBox Guest Additions...
Removing existing VirtualBox DKMS kernel modules ...done.
Removing existing VirtualBox non-DKMS kernel modules ...done.
Building the VirtualBox Guest Additions kernel modules ...done.
Doing non-kernel setup of the Guest Additions ...done.
You should restart your guest to make sure the new modules are actually used
```

Tight! :)
]]></content:encoded>
      <dc:date>2013-01-16T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Lock Your Cronjobs, Enjoy Your Sleep</title>
      <link>https://kvz.io/lock-your-cronjobs.html</link>
      <description><![CDATA[If you use EC2 you may have heard of Tim Kay's aws commandline tool.
It provides access to most of Amazon's API and is less cumbersome
than Amazon's own CLI utilities in day to day use.
]]></description>
      <pubDate>Mon, 31 Dec 2012 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/lock-your-cronjobs.html</guid>
      <content:encoded><![CDATA[If you use EC2 you may have heard of Tim Kay's [aws](https://timkay.com/aws/) commandline tool.
It provides access to most of Amazon's API and is less cumbersome
than Amazon's own CLI utilities in day to day use.

A lesser known tool by Tim Kay is [solo](https://timkay.com/solo/). It's basically one line of Perl,
but it's incredibly useful to defeat a common problem with cronjobs: overlap.

<!--more-->

## The Problem

You've probably dealt with this before: you write a pretty neat `yourscript.sh` and
schedule it to run every minute or so on your production server.
One night, your server reaches a load of 90 and you get pagerdutied to fix this.

You log in, which takes about 15 minutes, manage to execute `ps auxf`,
and it appears your server now has `8325` instances of `yourscript.sh` running.
What happened?!

Maybe there was an infinite loop in your script, maybe there were NFS timeouts,
or maybe you tried to update a database that was write-locked during a backup.
Whatever the cause, there was overlap `8324` times, and this should never happen.
Not even once.

## The Solution

One way to defeat it is to write perfect code and have 0 external dependencies that can increase
your script's execution time.

But since that is never going to happen ; ) I recommend taking a look at `solo`.

Tim Kay realised that operating systems typically can only ever have 1 process listening on a port,
and makes clever use of that as a locking mechanism.

## The Flow

- `/usr/bin/solo -port=3000 /path/to/yourscript.sh`
- Solo tries to open port 3000
- Can it open port 3000?
  - Start `yourscript.sh`
- Can't open port 3000?
  - Never mind, `yourscript.sh` is probably still running, will try again next time

Naturally this beats working with lock/PID files, because an open port is directly tied
to a running process, so the chances of inconsistency, and of having to detect and clean up
orphaned PID files, are zero.
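
To get a feel for the acquire-or-skip flow without installing anything, here is a sketch that uses `flock(1)` from util-linux as a stand-in. To be clear, this is not solo and not how solo works internally: `flock` locks a file rather than a port, although the kernel likewise drops the lock when the process holding it goes away. The lock path and the `run_once` name are made up for illustration.

```bash
#!/usr/bin/env bash
# Stand-in for solo's flow, NOT solo itself: flock(1) from util-linux.
lock="${TMPDIR:-/tmp}/yourscript.lock"   # hypothetical lock path

run_once () {
  # -n: don't wait; fail right away if another process holds the lock
  flock -n "${lock}" "$@"
}

run_once sleep 2 &        # first "cron run" takes the lock
sleep 0.5

if run_once true; then    # second run overlaps the first
  echo "started"
else
  echo "skipped, still running"
fi

wait                      # let the first run finish
```

On a Linux box with util-linux this should print `skipped, still running`, which is the "never mind, will try again next time" branch of the flow.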

## The Example

Your [crontab](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/) could e.g. look like this:

```bash
$ crontab -e
*    * * * * /usr/bin/solo -port=3001 /path/to/yourscript1.sh
*    * * * * /usr/bin/solo -port=3002 /path/to/yourscript2.sh
*/10 * * * * /usr/bin/solo -port=3003 /path/to/yourscript3.sh
```

You can now be sure that only one instance of each script will run at any given time.

Clever chaps may realise this can be used as a keepalive system for daemon-like
scripts. However I suggest looking into
[monit](https://mmonit.com/monit/) or
[upstart](https://kvz.io/blog/2009/12/15/run-nodejs-as-a-service-on-ubuntu-karmic/) for that.

## The Installation

This is what makes solo great, it has basically 0 dependencies
(ok Perl, but I'll assume you have that) and is a breeze to deploy.

```bash
$ sudo curl -q -L https://raw.github.com/timkay/solo/master/solo -o /usr/bin/solo \
  && sudo chmod a+x $_
```

Happy `crontab -e`ing, and happy dreams with few pagerduties, you've earned it :)

## The Alternative

As Jason mentioned, if your jobs don't necessarily need to finish running, but just need to restart without overlap, another option is to use [timeout](https://linux.die.net/man/1/timeout):

```bash
$ crontab -e
* * * * * timeout -s9 50s /path/to/yourscript1.sh
```
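
To see what `timeout` does to a job that overruns, here is a small sketch you can run in a terminal. It assumes GNU coreutils `timeout` is installed (it is not part of a stock macOS), and it uses `sleep 5` as a stand-in for `yourscript1.sh`, with a 1 second limit instead of 50.

```bash
#!/usr/bin/env bash
# Sketch: a "job" that would run for 5s gets SIGKILLed after 1s.
if timeout -s9 1s sleep 5; then
  echo "job finished in time"
else
  echo "job was killed (exit status $?)"
fi
```

In the crontab line above the limit is 50 seconds for a job that starts every minute, so an overrunning job is gone before the next one starts.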

## The Next Level: Go Distributed With Cronlock

Ok so solo is the bomb in terms of simplicity for running 1 instance of 1 script on 1 server.

But what if you want to make sure only 1 instance of 1 script can run throughout many servers?
Install the cronjobs on just one server? Hm.. what if it goes down. That means
someone will have to intervene, chances are they will forget, and your nodes aren't really expendable.

Especially in volatile environments where nodes come & go as they please, you want cronjobs to be the
responsibility of the collective, not just 1 machine.

For this purpose I wrote [Cronlock](https://github.com/kvz/cronlock).

### The Good

- You can deploy all nodes equally, install all cronjobs on all servers, if a node goes down, another will make sure your jobs are executed

### The Bad

- It relies on a central Redis server. If your cluster already relies on Redis,
  you're not adding a new dependency or [SPOF](https://en.wikipedia.org/wiki/Single_point_of_failure).
  If your cluster doesn't, reconsider using Cronlock.

### The Ugly

- I use straight up Bash for everything. I don't even use `redis-cli` to communicate with Redis.
  This is because I want deployment to be as easy as with solo. Just a

```bash
$ sudo curl -q -L https://raw.github.com/kvz/cronlock/master/cronlock -o /usr/bin/cronlock \
  && sudo chmod a+x $_
```

and you're set.

You can visit [Cronlock on Github](https://github.com/kvz/cronlock) for docs on how to configure it.
]]></content:encoded>
      <dc:date>2012-12-31T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Installing Hubot on Ubuntu</title>
      <link>https://kvz.io/installing-hubot-on-ubuntu.html</link>
      <description><![CDATA[
]]></description>
      <pubDate>Tue, 20 Nov 2012 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/installing-hubot-on-ubuntu.html</guid>
      <content:encoded><![CDATA[<img src="/assets/images/posts/2012-11-20-installing-hubot-on-ubuntu-0.png" title="Hubot" alt="Hubot" style="float: left; margin-right: 10px;"/>

We used to run
[Hubot on Heroku](https://github.com/github/hubot/wiki/Deploying-Hubot-onto-Heroku) until
it crashed. Not sure what happened exactly, but we didn't bother bringing it back due to more pressing issues
within our company.

Then I saw one of the most gorgeous presentations ever,
[Intergalactic Javascript Robots from Outer Space](https://speakerdeck.com/tanoku/intergalactic-javascript-robots-from-outer-space),
and it got me excited to run a [Hubot](https://hubot.github.com/) again.

<!--more-->

This time I wanted to [Deploy Hubot onto UNIX](https://github.com/github/hubot/wiki/Deploying-Hubot-onto-UNIX).
I followed the excellent article, and made some adjustments below in this blogpost to
accommodate better system integration and copypastability :)

## First Off

I'm assuming you use Campfire. Let's create a bot user by the name `hubot` there.
Make sure it has a working email address because Campfire will want to validate it.

Hint: It's fun to give your `hubot@yourcompany.com` a
[nice avatar](/assets/images/posts/2012-11-20-installing-hubot-on-ubuntu-1.png)
at [gravatar.com](https://gravatar.com).

All done? Let's install!

## Install

```bash
# Prerequisites
aptitude install build-essential libssl-dev git-core redis-server libexpat1-dev logtail

# If you don't have node 0.6+
NODE_VERSION="v0.8.0"
NODE_BASE="node-${NODE_VERSION}"
pushd /usr/local
  wget https://nodejs.org/dist/${NODE_VERSION}/${NODE_BASE}.tar.gz
  tar xf ${NODE_BASE}.tar.gz -C src && cd src/${NODE_BASE}
  ./configure && make && make install
popd

# Coffeescript
npm install -g coffee-script

# Hubot
pushd /usr/local
  git clone https://github.com/github/hubot.git && cd hubot
  npm install
popd
```

Note that npm is bundled with node since [v0.6.3](https://blog.nodejs.org/2011/11/25/node-v0-6-3/).

### Hubot System User

```bash
$ adduser --disabled-password --gecos "" hubot
```

### Upstart Script

```bash
$ cat <<EOF > /etc/init/hubot.conf
description "Hubot Campfire bot"

# Campfire-specific environment variables. Change these:
env HUBOT_CAMPFIRE_ACCOUNT='companyname' # the one in: <companyname>.campfirenow.com
env HUBOT_CAMPFIRE_ROOMS='123456'
env HUBOT_CAMPFIRE_TOKEN='afafafafafafafafafafcdcdcdcdcdcdcdcdcdc'

# Subscribe to these upstart events
# This will make Hubot start on system boot
start on filesystem or runlevel [2345]
stop on runlevel [!2345]

# Path to Hubot installation
env HUBOT_DIR='/usr/local/hubot/'
env HUBOT='bin/hubot'
env ADAPTER='campfire'
env HUBOT_USER='hubot' # system account
env HUBOT_NAME='bot' # what hubot listens to
env PORT='5555'

# Keep the process alive, limit to 5 restarts in 60s
respawn
respawn limit 5 60

exec start-stop-daemon --start --chuid \${HUBOT_USER} --chdir \${HUBOT_DIR} \
  --exec \${HUBOT_DIR}\${HUBOT} -- --name \${HUBOT_NAME} --adapter \${ADAPTER}  >> /var/log/hubot.log 2>&1
EOF
```
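
Note the `\${...}` escapes in that heredoc. Because the `EOF` delimiter is unquoted, the shell would otherwise substitute those variables while writing the file, before upstart ever gets to see them. Here is a tiny sketch of the difference; the value of `HUBOT_USER` and the `render` function are made up for the demo.

```bash
#!/usr/bin/env bash
# Sketch: escaped vs. unescaped variables in an unquoted heredoc.
HUBOT_USER="from-my-shell"   # hypothetical value, set only for this demo

render () {
  cat <<EOF
unescaped: ${HUBOT_USER}
escaped:   \${HUBOT_USER}
EOF
}

render
# unescaped: from-my-shell
# escaped:   ${HUBOT_USER}
```

The escaped form ends up in the file literally, so upstart expands it at run time using the `env` lines in `hubot.conf`.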

Now don't forget to `$EDITOR /etc/init/hubot.conf` and change the 3 `HUBOT_CAMPFIRE_` environment variables.

To start, type:

```bash
$ start hubot
```

To check the log, type:

```bash
$ logtail /var/log/hubot.log
```

## Use

Now check Campfire to see if `Hubot entered the room`, and tell him to send you an image; type:

```bash
bot: image me donuts
```

Some other commands to get you started may be retrieved by issuing:

```bash
bot help
```

This will show you:

```bash
bot <user> is a badass guitarist - assign a role to a user
bot <user> is not a badass guitarist - remove a role from a user
bot animate me <query> - The same thing as `image me`, except adds a few parameters to try to return an animated GIF instead.
bot convert me <expression> to <units> - Convert expression to given units.
bot die - End hubot process
bot echo <text> - Reply back with <text>
bot help - Displays all of the help commands that Hubot knows about.
bot help <query> - Displays all help commands that match <query>.
bot image me <query> - The Original. Queries Google Images for <query> and returns a random top result.
bot map me <query> - Returns a map view of the area returned by `query`.
bot math me <expression> - Calculate the given expression.
bot mustache me <query> - Searches Google Images for the specified query and mustaches it.
bot mustache me <url> - Adds a mustache to the specified URL.
bot ping - Reply with pong
bot pug bomb N - get N pugs
bot pug me - Receive a pug
bot show storage - Display the contents that are persisted in the brain
bot show users - Display all users that hubot knows about
bot the rules - Make sure hubot still knows the rules.
bot time - Reply with current time
bot translate me <phrase> - Searches for a translation for the <phrase> and then prints that bad boy out.
bot translate me from <source> into <target> <phrase> - Translates <phrase> from <source> into <target>. Both <source> and <target> are optional
bot who is <user> - see what roles a user has
bot youtube me <query> - Searches YouTube for the query and returns the video embed link.
```

## Going Forward

You can write your own [Hubot Scripts](https://github.com/github/hubot-scripts#readme), or use
[those by others](https://hubot-script-catalog.herokuapp.com/).

Happy hubbing :)
]]></content:encoded>
      <dc:date>2012-11-20T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Highlevel Testing With CasperJS</title>
      <link>https://kvz.io/highlevel-testing-with-casperjs.html</link>
      <description><![CDATA[If you've written a webapp and you want to ensure that critical parts such as the signup process stay working, the best would be to have an actual user go through that process every time you change your codebase. But since that is both tedious &amp; expensive, the second best thing is to automate a chrome browser (webkit engine anyway) to do this for you, and upload screenshots if anything unexpected happens.
]]></description>
      <pubDate>Sat, 03 Nov 2012 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/highlevel-testing-with-casperjs.html</guid>
      <content:encoded><![CDATA[If you've written a webapp and you want to ensure that critical parts such as the signup process stay working, the best would be to have an actual user go through that process every time you change your codebase. But since that is both tedious & expensive, the second best thing is to automate a chrome browser (webkit engine anyway) to do this for you, and upload screenshots if anything unexpected happens.

Welcome to CasperJS!

<!--more-->

From the [CasperJS](https://casperjs.org/) website:

> "Casperjs is an open source navigation scripting & testing utility written in Javascript and based on [PhantomJS](https://www.phantomjs.org/) — the scriptable headless WebKit engine. It eases the process of defining a full navigation scenario and provides useful high-level functions, methods & syntactic sugar".

## Install

### OSX

If you use [Homebrew](https://mxcl.github.com/homebrew/), you can install both CasperJS and PhantomJS using this command:

```bash
$ brew install casperjs
```

### Ubuntu

To install both PhantomJS and CasperJS into `/usr/local` with this layout:

```bash
/usr/local/bin/casperjs  -> /usr/local/casperjs-1.0.0/bin/casperjs
/usr/local/bin/phantomjs -> /usr/local/phantomjs-1.8.0-linux-x86_64/bin/phantomjs
```

Please execute the following
(Warning: purges all previous versions of CasperJS inside /usr/local/)

```bash
PHANTOM_VERSION="1.8.0"
PHANTOM_BASE="phantomjs-${PHANTOM_VERSION}-linux-x86_64"
if [ "$(/usr/local/bin/phantomjs --version 2>/dev/null)" != "${PHANTOM_VERSION}" ]; then
  DEBIAN_FRONTEND=noninteractive apt-get -y --force-yes install libfontconfig1-dev
  pushd /usr/local/
    wget --quiet https://phantomjs.googlecode.com/files/${PHANTOM_BASE}.tar.bz2
    tar -jxvf ${PHANTOM_BASE}.tar.bz2 && rm ${PHANTOM_BASE}.tar.bz2*
    ln -nfs /usr/local/${PHANTOM_BASE}/bin/phantomjs /usr/local/bin/phantomjs
  popd
fi

CASPER_VERSION="1.0.0"
CASPER_BASE="casperjs-${CASPER_VERSION}"
if [ "$(/usr/local/bin/casperjs --version 2>/dev/null)" != "${CASPER_VERSION}" ]; then
  pushd /usr/local/
    rm -rf *casperjs*
    wget --quiet https://github.com/n1k0/casperjs/tarball/${CASPER_VERSION}
    tar -zxvf ${CASPER_VERSION} && rm ${CASPER_VERSION}*
    mv *casperjs* /usr/local/${CASPER_BASE} # n1k0-casperjs-e629586
    ln -nfs /usr/local/${CASPER_BASE}/bin/casperjs /usr/local/bin/casperjs
  popd
fi
```

```bash
$ phantomjs --version
$ casperjs --version
```

## Use

### Notes

You can write CasperJS scripts in Javascript or [Coffeescript](https://coffeescript.org/).
CasperJS will just switch interpreters based on the extension of the script you feed it.
For small projects like these I prefer Coffeescript.

Note that CasperJS is not [node.js](https://nodejs.org/) and though compatible with require, you
[cannot use any npm modules](https://github.com/n1k0/casperjs/issues/247).

To check if your `.coffee` files are valid, I recommend running them through
[coffeelint](https://www.coffeelint.org/) (`npm install -g coffeelint`).

Make sure you [run at least RC3](https://github.com/n1k0/casperjs/issues/291#issuecomment-10994973)
if you want to capture screenshots of timeouts as well.

### Example

Here's an example script that shows some different tricks, I've commented
along the way. Some gems:

- Anytime a testcase fails, a `.png` is saved. CasperJS will exit with code `1`
  so it's really easy to detect a fail then upload this screenshot to [campfire](https://campfirenow.com/)
  for example. This is possible using just `curl` and your campfire api keys.
- Anytime a page contains: `Error` or `Exception`, a fail is automatically
  triggered without the need to write additional asserts for this. It can be
  disabled on a URL basis (in this case `/nonexistent` is
  allowed to throw these texts).

```coffeescript
## Setup
##########################################################################

utils  = require("utils")
casper = require("casper").create
  verbose: true
  logLevel: "warning"
  exitOnError: true
  safeLogs: true
  viewportSize:
    width: 1024
    height: 768

testhost   = casper.cli.get "testhost"
screenshot = casper.cli.get "screenfile"

casper
  .log("Using testhost: #{testhost}", "info")
  .log("Using screenshot: #{screenshot}", "info")

if not testhost or not screenshot or not /\.(png)$/i.test screenshot
  casper
    .echo("Usage: $ casperjs test project.coffee --ignore-ssl-errors=yes --testhost=<testhost> --screenfile=<screenshot.png>")
    .exit(1)

## Hooks
##########################################################################

# Capture screens from all fails
casper.test.on "fail", (failure) ->
  casper.capture(screenshot)
  casper.exit 1

# Capture screens from timeouts from e.g. @waitUntilVisible
# Requires RC3 or higher.
casper.options.onWaitTimeout = ->
  @capture(screenshot)
  @exit 1

# Scan for the word notice|warning|error|exception by default
casper.on "step.complete", (page) ->
  # Skip urls that can contain 'error'/'exception'
  u = casper.getCurrentUrl()
  if (u == "https://#{testhost}/nonexistent")
    return

  @test.assertEval ->
    !$('div#content').text().match(/(notice|warning|error|exception)/i)
  , "no notices, warnings, errors or exceptions in #{u}"

## Testcases
##########################################################################

# This is an app that has everything (even the /news page) behind a login.

# try to access nonexistent when not logged in (don't 404, we only tell customers what exists and what not)
casper.start "https://#{testhost}/nonexistent", ->
  @test.assertHttpStatus(302, "nonexistent should 302 when not logged in (can't show guests what exists)")
  @test.assertUrlMatch /\/customers\/login/, "redirect to login"

# open /news/ without login, errors out, should go to login,
casper.thenOpen "https://#{testhost}/news/", ->
  @test.assertTextExists "I could not give you access to", "cannot access news without login"
  @test.assertUrlMatch /\/customers\/login/, "redirect to login"
  @test.assertTitle "Please login", "login page title is the one expected"
  @test.assertExists "form[action=\"/customers/login/\"]", "login page must have a form with customer/login action"
  @fill "form[action=\"/customers/login/\"]", { "data[Customer][username]": "janedoe", "data[Customer][password]": "jsdi32ld!" }, true

# redirect to landing page /news/
casper.then ->
  @test.assertUrlMatch /\/news/, "redirected to landing page after login"

# notice login twice
casper.thenOpen "https://#{testhost}/customers/login", ->
  @test.assertTextExists "You are already logged in", "notice already logged in"

# try to access admin page
casper.thenOpen "https://#{testhost}/admin/tickets", ->
  @test.assertTextExists "I could not give you access to ", "prohibit to access admin page"

# try to access nonexistent when logged in
casper.thenOpen "https://#{testhost}/nonexistent", ->
  @test.assertHttpStatus 404, "nonexistent should 404 when logged in"

# dashboard has panels
casper.thenOpen "https://#{testhost}/customers/dashboard", ->
  @test.assertTitle "Dashboard", "customer dashboard title is ok"
  @test.assertEvalEquals ->
    __utils__.findAll(".user-dashboard div.accordion-heading").length
  , 8, "found 8 customer dashboard panels"

# calculate storage price
casper.thenOpen "https://#{testhost}/storage_accounts/add", ->
  @evaluate ->
    $("#StorageAccountBytesMax").val("10737418240")
    $("#StorageAccountPassword").val("dlfksfag!1")
    $("#StorageAccountEmail").val("janedoe@example.com")
    $("#StorageAccountBytesMax").change()

  @waitFor ->
    @evaluate ->
      $("#billabe_buy").text() != "Calculating..."
  , ->
    @test.assertSelectorHasText "#billabe_buy", "45.00", "10gb is 45.00 euros for janedoe"

# unowned invoice: prohibit
casper.thenOpen "https://#{testhost}/invoices/view/201100493", ->
  @test.assertTextExists "Invoice not found", "prohibit access to invoice of another customer"

# owned invoice: allow and check its price is 12 cents
casper.thenOpen "https://#{testhost}/invoices/view/201100975", ->
  @test.assertTextExists "Subtotal", "my invoice has a subtotal"
  @test.assertEval ->
    $("td.total").text().indexOf("0.12") > -1
  , "invoice 201100975 total is 12 cents"


## Bombs away
##########################################################################

casper.run ->
  @test.renderResults true
```

See what we did? In just ~100 LoC we make sure this app deals correct prices for
new products, protects people's invoices and admin pages from unauthorized access, makes sure
the login system functions & redirects correctly, and that no page except `/nonexistent`
has any errors on it.
If any of these conditions aren't met, a screenshot is made.

### Run

To run it, type something like:

```bash
$ casperjs \
  test \
  ./tests/project.coffee \
  --ignore-ssl-errors=yes \
  --testhost=staging.exampleproject.com \
  --screenfile=./webroot/fails/screenshot.png # || script to upload screenshot.png to campfire.
```

Ideally, you'd wrap this up in a script and plug it into your
[Continuous Integration](https://en.wikipedia.org/wiki/Continuous_integration) server
so that it gets run on every change.
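
As a rough idea of what such a wrapper could look like, here is a minimal sketch. The `run_suite` name, the test path, and the upload hook are assumptions for illustration; it relies only on `casperjs` exiting non-zero on a fail, as described earlier.

```bash
#!/usr/bin/env bash
# Sketch of a CI wrapper. Paths and the notify step are hypothetical.
run_suite () {
  local testhost="${1}"
  local screenfile="${2}"

  rm -f "${screenfile}"   # never report a stale screenshot

  if casperjs test ./tests/project.coffee \
      --ignore-ssl-errors=yes \
      --testhost="${testhost}" \
      --screenfile="${screenfile}"; then
    echo "PASS ${testhost}"
    return 0
  fi

  echo "FAIL ${testhost}, screenshot: ${screenfile}"
  # Hypothetical hook: upload ${screenfile} to your chat room here
  return 1
}
```

Your CI job would then run something like `run_suite staging.exampleproject.com ./webroot/fails/screenshot.png` and fail the build on a non-zero exit code.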

**Alternatively**

While still developing, it's really pleasant to have your Mac open the screenshot
automatically after any fail:

```bash
$ rm -f ~/Desktop/screen.png \
 ; casperjs test ./tests/main.coffee --ignore-ssl-errors=true --testhost=www.example.local --screenfile="${HOME}/Desktop/screen.png" \
|| open ~/Desktop/screen.png
```

## Conclusion

There are also paid services you can outsource this to. Most of them offer a lot more
features such as also testing against FF, IE, Opera, Mobile, etc. so it may
make sense for you to use one of those. Some I know in no particular order:

- [Sauce Labs](https://saucelabs.com/)
- [Browserling](https://browserling.com/)
- [BrowserStack](https://www.browserstack.com/)

As for some advantages of rolling this out yourself:

- customize to your needs, run on your own CI server
- the tests & actual code are stored in the same repository, hack on your code, hack on your tests, it's all versioned and coupled, this makes it easy and fun to update your tests.
- no monthly fees
- and as you've noticed it's actually not hard to do anymore, thanks to PhantomJS & CasperJS
]]></content:encoded>
      <dc:date>2012-11-03T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Reverse a Multibyte String in PHP</title>
      <link>https://kvz.io/reverse-a-multibyte-string-in-php.html</link>
      <description><![CDATA[PHP's strrev
is not safe to use on utf-8 strings because it reverses a string
one byte at a time. So if a character consists of multiple bytes it cannot be preserved
as an entity in the reversed result.
]]></description>
      <pubDate>Tue, 09 Oct 2012 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/reverse-a-multibyte-string-in-php.html</guid>
      <content:encoded><![CDATA[PHP's [strrev](https://php.net/manual/en/function.strrev.php)
is not safe to use on utf-8 strings because it reverses a string
one byte at a time. So if a character consists of multiple bytes it cannot be preserved
as an entity in the reversed result.

There is no [Multibyte String](https://php.net/manual/en/book.mbstring.php) alternative
to `strrev` either.

<!--more-->

We did some googling, but strangely enough all solutions we encountered were
either invalid or incredibly heavy memory/code wise.

For example:

- [using utf8\_decode](https://stackoverflow.com/a/4919626/151666) only works if your characters in the string exist in the ISO-8859-1 character set
- [using preg\_match\_all](https://php.net/manual/en/function.strrev.php#83461) seems weirdly over-engineered
- [a simpler preg\_match\_all](https://php.net/manual/en/function.strrev.php#62422) works, but on a 2MB string PHP was already using 150MB of memory. This is actually what sparked our search
  when [@renan\_saddam](https://twitter.com/renan_saddam) noticed his
  [PHP port](https://github.com/renansaddam/email_reply_parser/blob/9c2610fbec87211591701ec322723129ae4a1768/library/EmailReplyParser/Fragment.php#L82) of Github's [email\_reply\_parser](https://github.com/github/email_reply_parser) choked on a 2MB multibyte email.

## What We Came Up With

Is dead simple, but I'm putting it online anyway since it's apparently not common knowledge.

```php
<?php
function mb_strrev ($string, $encoding = null) {
	if ($encoding === null) {
		$encoding = mb_detect_encoding($string);
	}

	$length   = mb_strlen($string, $encoding);
	$reversed = '';
	while ($length-- > 0) {
		$reversed .= mb_substr($string, $length, 1, $encoding);
	}

	return $reversed;
}
?>
```

Example:

```php
<?php
echo    strrev('Gonçalves') . "\n"; // returns sevla??noG
echo mb_strrev('Gonçalves') . "\n"; // returns sevlaçnoG
?>
```

In our tests, the above function was **5x** more memory-efficient than the `preg_match_all` solution.

Hope this helps
]]></content:encoded>
      <dc:date>2012-10-09T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Quick Server Debugging With WTF</title>
      <link>https://kvz.io/quick-server-debugging-with-wtf.html</link>
      <description><![CDATA[If something weird is happening, you want to know everything that's going on
on a server, as fast as possible.
]]></description>
      <pubDate>Wed, 03 Oct 2012 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/quick-server-debugging-with-wtf.html</guid>
      <content:encoded><![CDATA[If something weird is happening, you want to know everything that's going on
on a server, as fast as possible.

At these times, you will be very happy to have a simple alias `wtf` installed
that you can type immediately after logging into a server, and see
all that it's busy with.

<!--more-->

## tl;dr: oneliner (okay, three-liner)

```bash
$ [ -z "$(fgrep 'alias wtf' ~/.bash_aliases)" ] \
  && echo "alias wtf='tail -f /var/log/{dmesg,messages,*{,/*}{log,err}}'" \
  >> ~/.bash_aliases && . ~/.bash_aliases
```

## Explained

Normally you could type

```bash
$ tail -f /var/log/messages
$ tail -f /var/log/syslog
$ tail -f /var/log/mysql/error.log
$ tail -f /var/log/nginx/php-errors.log
$ # etc
```

But you can have all this goodness in one command:

```bash
$ tail -f /var/log/{dmesg,messages,*{,/*}{log,err}}
```
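
If you're curious what bash turns that pattern into, here is a minimal sketch. The `expand_wtf_pattern` name is made up for this example. It switches off globbing with `set -f`, so you only see the brace expansion and not the files that happen to match on your server:

```bash
#!/usr/bin/env bash
# Hypothetical helper, only here to illustrate the pattern.
expand_wtf_pattern() {
  # Brace expansion is a bash feature, so we ask bash explicitly.
  # set -f disables globbing: the * wildcards are printed as-is.
  bash -c 'set -f; echo /var/log/{dmesg,messages,*{,/*}{log,err}}'
}

expand_wtf_pattern
```

So `tail` receives `dmesg`, `messages`, and anything ending in `log` or `err`, both directly in `/var/log` and one directory deeper.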

Now this is slightly hard to type, so we'll save it in an alias:

```bash
$ alias wtf='[...]'
```

Cool. You can now just type `wtf`.

However, next time you log in, the alias is lost. To make it persist, we store it
in your `~/.bash_aliases`. This file gets sourced every time you log into bash
(if you use zsh, you will know what to do):

```bash
$ echo "[...]" >> ~/.bash_aliases
```

But this only stores the alias for your next logins. To make your current session
profit, we source it now:

```bash
$ . ~/.bash_aliases
```

One more thing, you should just be able to copy paste this, without the alias getting
added twice, so we make sure it does not exist yet:

```bash
$ [ -z "$(fgrep 'alias wtf' ~/.bash_aliases)" ]
```

(using `fgrep` directly as the condition would not work here: it exits non-zero both when the alias is not found
and when `~/.bash_aliases` does not exist yet, and in both cases we still want to continue, so we check for empty output with `-z` instead)

And we put all this together as shown above in the oneliner, and we can start using `wtf` directly.
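
The "only add it once" trick is handy for other lines too. Below is a minimal sketch of it as a reusable function. The name `add_line_once` is made up for this example, and it swaps the `-z` test for `grep -qxF` (quiet, whole line, fixed string), which is an alternative and not what the oneliner above does:

```bash
#!/usr/bin/env bash
# Hypothetical helper: append a line to a file unless it is already in there.
add_line_once() {
  local line="$1" file="$2"
  # stderr is silenced so a missing file simply counts as "not found"
  if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Usage (safe to paste more than once):
# add_line_once "alias wtf='tail -f /var/log/{dmesg,messages,*{,/*}{log,err}}'" ~/.bash_aliases
```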

## On a Mac?

Add the following to your `~/.bash_profile`

```bash
if [ -f ~/.bash_aliases ]; then
. ~/.bash_aliases
fi
```

Courtesy of [@dogmatic and myself](https://twitter.com/dogmatic69/statuses/78102198432710656)

:)
]]></content:encoded>
      <dc:date>2012-10-03T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Blog With Octopress and Github Pages</title>
      <link>https://kvz.io/blog-with-octopress.html</link>
      <description><![CDATA[This article aims to provide a compact tutorial for setting up an Octopress blog
from scratch on OSX.
]]></description>
      <pubDate>Tue, 25 Sep 2012 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/blog-with-octopress.html</guid>
      <content:encoded><![CDATA[This article aims to provide a compact tutorial for setting up an Octopress blog
from scratch on OSX.

Since so many people encounter Mountain Lion breakage, the article starts with
a few pointers on how to get Homebrew, Git, make, and Ruby going first.

<!--more-->

Please consider for each step whether you really want to do it.
For instance, I replace `rvm` with `rbenv`; you may want to skip that.

- Replace `kvz.io` with your domain name everywhere.
- Replace `kvz` with your Github username everywhere.

## Upgraded to Mountain Lion?

Make sure you can build something again

- Upgrade Xcode
- Launch Xcode
- Select Xcode -> Preferences from the menu bar.
- Select the Downloads tab.
- Install "Command Line Tools".

"Boots into OS X, and my terminal takes FOREVER to boot.
Fixing permissions on the /usr/local directory fixed most of that"

```bash
$ sudo chown -R `whoami` /usr/local
```

### Fixing Homebrew

```bash
$ brew update
$ brew outdated|xargs brew install
$ brew tap homebrew/dupes
$ brew install apple-gcc42 git
$ brew upgrade
```

### Switch from rvm to rbenv

Remove rvm

```bash
$ rvm implode
```

Install rbenv instead, and Ruby 1.9.3-p194 which is required for Octopress

```bash
$ brew install rbenv
$ brew install ruby-build
$ eval "$(rbenv init -)"
$ rbenv install 1.9.3-p194
$ rbenv global 1.9.3-p194
```

### Environment

This should be in your `~/.bash_profile`, `~/.zshrc` or `~/.zprofile`

```bash
export PATH="$HOME/.rbenv/bin:$PATH"
eval "$(rbenv init -)"
# required for https://github.com/imathis/octopress/issues/144
export LC_CTYPE=en_US.UTF-8
export LANG=en_US.UTF-8
```

Source it now so you won't have to open another tab first

```bash
$ source ~/.zprofile || source ~/.zshrc || source ~/.bash_profile
```

Make sure you have no other Ruby versions in your `$PATH`

```bash
$ echo "# ${PATH}"
$ # /Users/kevin/.rbenv/shims:/Users/kevin/.rbenv/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
```

e.g., mine had a legacy `/Users/kevin/.gem/ruby/1.8/bin` in it, causing Jekyll to
segfault.
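
Eyeballing a long `$PATH` is error-prone, so here is a small sketch that lists suspicious entries for you. The function names and the patterns it greps for (`.gem`, `.rvm`, `ruby`) are assumptions for this example; adjust them to whatever your old setup left behind:

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of Octopress or rbenv.

# Print every entry of $PATH (or of the string passed in) on its own line
list_path_entries() {
  printf '%s\n' "${1:-$PATH}" | tr ':' '\n'
}

# Show entries that look like a Ruby install but do not belong to rbenv
find_stray_ruby_paths() {
  list_path_entries "$1" | grep -E '(\.gem|\.rvm|ruby)' | grep -v '\.rbenv' || true
}

# Usage:
# find_stray_ruby_paths            # inspects your current $PATH
```

If this prints nothing, only rbenv is in charge of Ruby.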

## Installing Octopress as Your Blog

```bash
$ git clone git://github.com/imathis/octopress.git kvz.io
$ cd kvz.io
$ ruby --version # should read ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin12.2.0]
$ gem install bundler
$ rbenv rehash
$ bundle install
```

### Deploying to Github project pages (<https://username.github.com/project[gh-pages]>)

Let octopress know about this

```bash
$ rake setup_github_pages
$ # Repository url: git@github.com:kvz/kvz.io.git
$ rake install
$ rake generate && rake deploy
```

Note: This is different from Github user pages (<https://username.github.com/username.github.com[master]>).

Set up a domain

```bash
$ echo 'kvz.io' > source/CNAME
```

Put these DNS records in place

```bash
A     record from kvz.io     to 204.232.175.78  (Cannot be a CNAME!)
CNAME record from www.kvz.io to kvz.github.com
```

### Makefile

So you'll only have to type `make blog` to push online ALL THE THINGS!

Edit `./Makefile`

```bash
all: tweets dependencies blog

setup:
	bundle exec rake setup_github_pages\[git@github.com:kvz/kvz.io.git\]

unpub:
	$$EDITOR .
	fgrep -rIi "published: false" ./source/_posts | awk -F: '{print "$$EDITOR " $$1}' |bash

tweets:
	bundle exec rake twitter
	make blog MSG="updated twitter"

comments:
	rake build_comments

dependencies:
	(cd ~/workspace/kvz.io 2>/dev/null || (cd ~/workspace && git clone git@github.com:kvz/kvz.io)) && git pull
	(cd ~/workspace/bash3boilerplate 2>/dev/null || (cd ~/workspace && git clone git@github.com:kvz/bash3boilerplate.git)) && git pull
	(cd ~/workspace/dotfiles 2>/dev/null || (cd ~/workspace && git clone git@github.com:kvz/dotfiles.git)) && git pull
	(cd ~/workspace/transloadit-api2 2>/dev/null || (cd ~/workspace && git clone git@github.com:transloadit/transloadit-api2.git)) && git pull
	(cd ~/workspace/nsfailover 2>/dev/null || (cd ~/workspace && git clone git@github.com:kvz/nsfailover)) && git pull
	(cd ~/workspace/logstreamer 2>/dev/null || (cd ~/workspace && git clone git@github.com:kvz/logstreamer)) && git pull

preview:
	bundle exec rake build && bundle exec rake generate && bundle exec rake preview

blog:
	git pull && \
	bundle install && \
	bundle exec rake integrate && \
	bundle exec rake build && \
	bundle exec rake generate && \
	bundle exec rake deploy && \
	git add .; \
	git commit -am "blog update $$(date +%Y-%m-%d)"; \
	git push origin master

.PHONY: blog%
```

### Don't forget to commit the source for your blog.

Add your project's master as the new origin. This is what
my `.git/config` looks like:

```bash
[core]
  repositoryformatversion = 0
  filemode = true
  bare = false
  logallrefupdates = true
  ignorecase = true
[remote "origin"]
  fetch = +refs/heads/*:refs/remotes/origin/*
  url = git@github.com:kvz/kvz.io.git
[remote "octopress"]
  url = git://github.com/imathis/octopress.git
  fetch = +refs/heads/*:refs/remotes/octopress/*
[branch "master"]
  remote = origin
  merge = refs/heads/master
```

This way you can pull from octopress/master to receive updates, and
push to origin/master to save your posts.
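
If you'd rather not edit `.git/config` by hand, roughly the same layout can be reached with plain `git` commands. This is a sketch under the assumption that you cloned Octopress as shown earlier, so it is currently called `origin`. The function name `setup_blog_remotes` is made up for this example:

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep Octopress as "octopress", make your own repo "origin".
setup_blog_remotes() {
  local repo_dir="$1" origin_url="$2"
  # The clone named the Octopress repository "origin"; rename it
  git -C "$repo_dir" remote rename origin octopress
  # Your own repository becomes the new origin
  git -C "$repo_dir" remote add origin "$origin_url"
  # Let master track origin/master
  git -C "$repo_dir" config branch.master.remote origin
  git -C "$repo_dir" config branch.master.merge refs/heads/master
}

# Usage:
# setup_blog_remotes ~/workspace/kvz.io git@github.com:kvz/kvz.io.git
```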

## Start Blogging

New article

```bash
$ rake new_post\["Blog with Octopress"\]
$ $EDITOR source/_posts/$(date +%Y-%m-%d)-blog-with-octopress.md
```

More

- [configuration](https://octopress.org/docs/configuring/)
- [basics](https://octopress.org/docs/blogging/)
- [code](https://octopress.org/docs/blogging/code/)
- [plugins](https://octopress.org/docs/blogging/plugins/)
- [3rd party plugins](https://github.com/imathis/octopress/wiki/3rd-party-plugins)

### Add an About Page

Add & edit like so

```bash
$ rake new_page\[about\]
$ $EDITOR source/about/index.md
```

Add a link to it in the navigation

```bash
$ $EDITOR source/_includes/custom/navigation.html
```

### Add an About Panel

```bash
$ $EDITOR source/_includes/custom/asides/about.html
```

and now add it to your `_config.yml` so it will be included on every page:

```yaml
default_asides:
- custom/asides/about.html
```

More [info](https://octopress.org/docs/blogging/)

### Bombs Away :)

```bash
$ make blog MSG="Updated blog"
```

## Thanks

- <https://josediazgonzalez.com/2012/07/25/upgrading-from-lion-to-mountain-lion/>
- <https://octopress.org/docs/>
- <https://www.moncefbelyamani.com/how-to-install-and-configure-octopress-on-a-mac/>
- <https://help.github.com/articles/user-organization-and-project-pages>
- <https://github.com/Shopify/liquid/wiki>
]]></content:encoded>
      <dc:date>2012-09-25T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Revisiting Faster PHP Sessions</title>
      <link>https://kvz.io/faster-php-sessions.html</link>
      <description><![CDATA[
  "Simplicity is prerequisite for reliability."

]]></description>
      <pubDate>Fri, 29 Apr 2011 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/faster-php-sessions.html</guid>
      <content:encoded><![CDATA[>  "Simplicity is prerequisite for reliability."

[Edsger W. Dijkstra](https://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD498.html)

As our experience grows, we learn from past mistakes and discover what's truly important in reliable systems.
When designing systems, simplicity is an often heard mantra, but it isn't getting applied nearly as much as it is spoken of. I'm guilty of this too. I think it's mainly because engineers love to, well, engineer :) and will naturally try to [outsmart problems by throwing more tech at it](https://teddziuba.com/2010/12/the-3-basic-tools-of-systems-engineering.html).

<!--more-->

## Article vs Article

In the light of this, I revisit my 2008 article [Enhance PHP session management](/blog/2008/06/22/enhance-php-session-management/).
The article explains how you can use a central memcache server to store sessions for performance & scalability purposes.

Having a shared something when you can avoid it is asking for problems, and I was just throwing unneeded tech at this: network protocols, pecl modules, configuration. All vulnerable to [bugs](https://pecl.php.net/bugs/search.php?cmd=display&package-name[]=memcache&status=All&search-for=&php-os=&boolean=0&author-email=&bug-type=&bug-age=0&bug-updated=0&order-by=id&direction=ASC&phpver=&limit=300&handle=&assign=&maintain=&begin=0), maintenance, performance penalties and outage.

Using my 2007 article [Create turbocharged storage using tmpfs](/blog/2007/07/18/create-turbocharged-storage-using-tmpfs/), we can
defeat some of this over-engineering and take a simpler approach to speeding up sessions in PHP.
We'll store them decentralized in memory by mounting RAM onto the existing `/var/lib/php5` session directories throughout your application servers, which I will call nodes from now on.

## Make Session Dir Live in RAM

Add this to your `/etc/fstab`:

```bash
# Make PHP Sessions live in RAM
tmpfs /var/lib/php5 tmpfs size=300M,atime 0 0
```

This will make sure the 300MB RAM device will be available on your next reboot as well.

300MB is a lot. You can decrease it later on by changing the `/etc/fstab` entry and
executing `mount -o remount /var/lib/php5`.

## Activate & Migrate Existing Sessions

Then execute:

```bash
$ # Create a temporary place for current sessions
$ mkdir -p /tmp/phpsessions/

$ # Move current sessions to it
$ mv /var/lib/php5/* /tmp/phpsessions/

$ # Activate our ramdisk
$ mount -a

$ # Move the current sessions back
$ mv /tmp/phpsessions/* /var/lib/php5/

$ # Remove the temporary placeholder
$ rmdir /tmp/phpsessions
```
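
To double check that the session directory really lives in RAM now, you could look it up in `/proc/mounts`. A minimal sketch, assuming a Linux box; the `is_tmpfs_mount` name is made up for this example:

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given mountpoint is backed by tmpfs.
is_tmpfs_mount() {
  local mountpoint="$1" mounts_file="${2:-/proc/mounts}"
  # /proc/mounts columns: device, mountpoint, fstype, options, dump, pass
  awk -v mp="$mountpoint" \
    '$2 == mp && $3 == "tmpfs" { found = 1 } END { exit !found }' \
    "$mounts_file"
}

# Usage:
# is_tmpfs_mount /var/lib/php5 && echo "sessions live in RAM" || echo "still on disk"
```

The second argument is only there so you can point it at a different file when trying it out.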

## Advantages

What's nice about saving sessions in a [tmpfs](/categories/tmpfs/) device compared with saving in memcache is:

- you can migrate to this solution without logging people out :)
- nothing needs to be installed
- instead of throwing errors, it degrades gracefully to disk storage if the mount fails
- you can restart/flush/upgrade any existing memcache instances without people losing sessions
- it uses the default `/var/lib/php5` directory, so no `.ini` changes, and PHP's garbage collector will still purge old sessions
- it takes away a bottleneck & single point of failure in your architecture
- it's just a mountpoint, so existing monitoring tools will automatically trigger alerts when you need to allocate more space
- no locking issues with ajax calls (though I believe fixed in memcached-3.0.4beta)
- no protocol overhead
- less tech, so less prone to errors & bugs, easier upgrade process

## Decentralizing

Now this doesn't work in clusters without Sticky Sessions.
But you've got to ask yourself: in huge clusters, do you really want Shared Sessions? The bigger the cluster, the more vulnerable you'll become, as
it really only adds a bottleneck & single point of failure to your architecture.

With decent load balancers like EC2's ELB, Pound, or HAProxy, it becomes child's play to implement Sticky Sessions so that people keep ending up on the node that has their session.

When you're [designing to tolerate failure](https://www.codinghorror.com/blog/2011/04/working-with-the-chaos-monkey.html), this architecture
is much more robust than depending on anything shared.

Yes, some people will be logged out when you shut down a node (vs *all* when your session store goes down).

To counter you could:

- drain a node's connections before you take it into planned maintenance, this way nobody is affected
- [rsync](/blog/2007/08/16/synchronize-files-with-rsync/) sessions between nodes if it's crucial that all sessions survive outage.

This could even be automated so that nodes can cover for each other.
Whether it's worth the investment depends on your application. Are your nodes likely to go down completely? How many customers will get logged out? What kind of data is lost?

Even if your session store is clustered and uses persistent storage like
[Redis](/blog/2010/03/25/redis-in-php/) or
[MySQL](/categories/mysql/)
(not the right tool for the job, people): network outage, maintenance and misconfiguration can hurt you badly, logging out all customers or worse, throwing errors throughout your platform.

Problems will be bigger and harder to solve.

Whereas if the RAM mountpoint fails, `/var/lib/php5` just degrades gracefully to normal disk-based storage, making sessions slower on that 1 node, but at least you'll still be serving customers.

I welcome your thoughts on this!
]]></content:encoded>
      <dc:date>2011-04-29T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Revisiting Spaces and Tabs</title>
      <link>https://kvz.io/spaces-vs-tabs.html</link>
      <description><![CDATA[This article in 50 words: I used to prefer spaces vs tabs, now I don't care so much, think it's
more important that you can easily switch on a per-project basis. Have some thoughts on how conventions
should be established, and I'll demonstrate bash code that can convert your codebase to a new standard.
]]></description>
      <pubDate>Thu, 31 Mar 2011 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/spaces-vs-tabs.html</guid>
      <content:encoded><![CDATA[This article in 50 words: I used to prefer spaces vs tabs, now I don't care so much, think it's
more important that you can easily switch on a per-project basis. Have some thoughts on how conventions
should be established, and I'll demonstrate bash code that can convert your codebase to a new standard.

<!--more-->

## Back in the Day

I used to prefer [spaces over tabs](https://www.jwz.org/doc/tabs-vs-spaces.html) so
code would look consistent throughout the monospace universe. No matter what crappy
viewer people ended up using on it: as long as it supported a monospace font,
your code looked as intended.
I also did some work for [PEAR](https://pear.php.net/package/System-Daemon)
and they enforced spaces in their thorough [Coding Standards](https://pear.php.net/manual/en/standards.php).
Which was later adopted by many frameworks, sometimes with small deviations.

## Quest for the Holy Convention

After years of spaces in my code I started using CakePHP and their standard was tabs.
Nothing to get hung up over, but after a while my code started intermingling with other Cake
developer's code and that's when it gets a little
[hairy](https://github.com/kvz/cakephp-rest-plugin/blob/40fbe802d3d9eb526efb71eaa101efc3d6b82090/controllers/components/rest.php#L681).

So I started using tabs there cause in my view conventions are much like traffic rules as I mentioned before in
[SQL Formatting](/blog/2009/03/04/sql-formatting/)

> It's irrelevant if people drive on the right or left side of the road,
>   as long as they do the same

At [Transloadit](https://transloadit.com) we started using 2 spaces for JavaScript as it's
[the way of node.js](https://nodeguide.com/style.html#tabs-vs-spaces).
And then there's a little Ruby project I started hacking on and they also like 2 spaces.

## Adopt Many Conventions

Coding standards change. Within a project, organization, framework, and even language.
Or they change for you simply because you contribute to different projects, organizations, or frameworks.

Instead of trying to enforce one preference throughout all of my projects, I
adopt the rules of the domain at hand. In this order:

```bash
Project > Organization > Framework > Language
```

(where conflicting, left wins over right)

On a side-note, I think it's the convention-designer's responsibility to align their conventions with
the *layer they're building on* as much as possible. So frameworks should look at their language, and
companies should look at their framework. This makes for consistent-looking codebases.
And that helps encourage involvement. Nobody likes messing with code that suffers from poor housekeeping.

In order to be flexible about this, it helps a lot if your IDE supports per-project settings
(I currently use both [NetBeans](/blog/2008/12/02/my-new-ide-netbeans/)
& [Vim](/blog/2010/11/30/learning-vim/), and they do a fine job at that).
In NetBeans it's easy to mess up though cause it's pretty much indentation agnostic. So sometimes you
won't notice you're filling a 4-spaces file with tabs, ruining the code in other views/editors.

Once that happens, or maybe if you're porting big chunks of 'legacy' code to a new
standard that's closer to your *layer*, you'll need decent conversion scripts.

## Switch Conventions

There are many pages on converting spaces to tabs on Linux or Mac,
but I wasn't satisfied as they:

- Also change non-leading whitespace (which may not be what you want, e.g. a tab-indented document could still [use spaces to promote readability around assignments](https://pear.php.net/manual/en/standards.funcalls.php),
  or inside big strings)
- Don't support multiple levels of indentation
- Can't be run from command-line (e.g. depend on IDE)
- Are specific to a language (`indent` / `astyle`)
- Messed up my indentation (`expand` / `unexpand`)

In an attempt to come up with a reliable tabs vs spaces converter that you
can simply run inside a directory and will traverse your source files, I'd like to share a couple
of lines of Bash.

### Warnings:

- Only do this when your source is under version control, these snippets make no backups!
  So execute, test, verify, commit. Or hit `git reset --hard` if you don't like it (leave a comment for improvement!)
- Currently processes `.php`, `.ctp`, `.js`, `.css`, `.sh`. But can easily be modified to do other extensions as well.

### Ubuntu

```bash
$ # 4 Spaces to tabs
$ find -P  . -type f -regextype egrep -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0  sed -i"" -e ':repeat; s/^\(\t*\)    /\1\t/; t repeat'

$ # extra: Strip any trailing whitespace
$ find -P  . -type f -regextype egrep -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0  sed -i"" -e 's/[[:blank:]]*$//g'

$ # extra: Strip any trailing blank lines (https://www.eng.cam.ac.uk/help/tpl/unix/sed.html)
$ find -P  . -type f -regextype egrep -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0  sed -i"" -e :a -e '/^\n*$/{$d;N;ba' -e '}'

$ # extra: Strip any trailing PHP closing tags
$ find -P  . -type f -regextype egrep -regex '.*\.(php|ctp)$' -print0 | xargs -0 sed -i"" -e :a -e '/^\n*$/{$d;N;ba' -e '}'

$ # extra: Check the PHP files for syntax errors
$ find -P  . -type f -regextype egrep -regex '.*\.(php|ctp)$' -exec php -l {} \; > /dev/null
```

### Mac

On a Mac? You need GNU sed! - Read below.

```bash
$ # 4 spaces to tabs
$ find -P -E . -type f -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 gsed -i"" -e ':repeat; s/^\(\t*\)    /\1\t/; t repeat'

$ # tabs to 2 spaces
$ find -P -E . -type f -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 gsed -i"" -e ':repeat; s/^\(\(  \)*\)\t/\1  /; t repeat'

$ # extra: Strip any trailing whitespace
$ find -P -E . -type f -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 gsed -i"" -e 's/[[:blank:]]*$//g'

$ # extra: Strip any trailing blank lines (https://www.eng.cam.ac.uk/help/tpl/unix/sed.html)
$ find -P -E . -type f -regex '.*\.(php|ctp|js|css|sh)$' -print0 | xargs -0 gsed -i"" -e :a -e '/^\n*$/{$d;N;ba' -e '}'

$ # extra: Strip any trailing PHP closing tags
$ find -P -E . -type f -regex '.*\.(php|ctp)$' -print0 | xargs -0 gsed -i"" -e :a -e '/^\n*$/{$d;N;ba' -e '}'

$ # extra: Check the PHP files for syntax errors
$ find -P -E . -type f -regex '.*\.(php|ctp)$' -exec php -l {} \; > /dev/null
```
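
Before letting these loose on a whole codebase, it may help to see what the `:repeat` loop does to a single line. Here is a minimal sketch that wraps the 4-spaces-to-tabs expression in a filter. The function name `indent_spaces_to_tabs` is made up for this example, and it builds a literal tab with `printf` instead of relying on the `\t` escape, which not every `sed` understands:

```bash
#!/usr/bin/env bash
# Hypothetical helper: converts leading groups of 4 spaces to tabs, reading stdin.
indent_spaces_to_tabs() {
  local tab
  tab="$(printf '\t')"
  # Each pass turns the first 4 leading spaces (after any tabs) into a tab.
  # "t repeat" jumps back to the label for as long as a substitution happened.
  sed -e ':repeat' -e "s/^\(${tab}*\)    /\1${tab}/" -e 't repeat'
}

# Usage: 8 leading spaces become 2 tabs, the spaces inside the line stay put
# printf '        foo  bar\n' | indent_spaces_to_tabs
```

Because the pattern is anchored to the start of the line, whitespace after the first non-blank character is never touched.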

### Run Into Problems?

Please let me know, I'll update the article so that these lines become the perfect converters.

## On a Mac? You need gnu-sed

Mac OSX (BSD) has a crippled `sed`.
This [illustrates](https://muzso.hu/2008/08/24/sed-in-darwin-leopard-is-crippled-in-many-ways) my point:

```bash
$ # On Mac:
$ echo "1|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat'
1|2|||5||7|

$ # On Linux:
$ echo "1|2|||5||7|" | sed -e ': repeat; s/||/|NULL|/; t repeat'
1|2|NULL|NULL|5|NULL|7|
```

Luckily you can get GNU sed for Mac OSX just as well.
[Get homebrew](https://github.com/mxcl/homebrew/wiki/installation), then run:

```bash
$ brew install gnu-sed
```

And change all `sed` references to `gsed`.

## That's it

How do you deal with changing/different coding standards?
]]></content:encoded>
      <dc:date>2011-03-31T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Optimize Your Synology NAS for Downloading</title>
      <link>https://kvz.io/optimize-your-synology-for-downloading.html</link>
      <description><![CDATA[I recently bought a NAS so my data is safe &amp; available, with the benefit of being low
power / noise / heat.
I've considered Netgear, QNAP, but decided to go for a Synology
as it was affordable, still had a big community, decent reviews &amp; Time Machine support.
]]></description>
      <pubDate>Mon, 28 Feb 2011 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/optimize-your-synology-for-downloading.html</guid>
      <content:encoded><![CDATA[I recently bought a NAS so my data is safe & available, with the benefit of being low
power / noise / heat.
I've considered Netgear, QNAP, but decided to go for a Synology
as it was affordable, still had a big community, decent reviews & Time Machine support.

<!--more-->

I wanted 4 bays so that I could use RAID5 and only lose 25% space on fault tolerance
instead of (RAID1) 50%. Synology has 2 offerings in the 4-bay home-user range this year:
the `DS411+` (fast) and the `DS411j` (slow).

I figured as long as it can blast a bandwidth adequate for 1080p over my network,
I'd save myself some money ($300 vs $600), heat and power consumption that
come with the more powerful `+` version.

However, now that it's here I want it to download from newsgroups and am running into
performance issues with my `j`unior edition.

No worries. With a little bit of hacking you can squeeze just enough performance
out of this thing to make sense of it all.

Here's how I turned my budget NAS that's mediocre at 8 things into a more powerful one
that's good at 3 things: downloading / file serving / backups.

## Warning

This article assumes you're somewhat skilled in Linux. By applying these
suggestions you could seriously mess up your Disk Station.

I'm doing this on a `DS411j` running DSM 3.0. Your mileage may vary.

## Downloading

In [an earlier article](/blog/2011/01/28/install-sabnzbd-on-your-synology/) I
described how to install SABnzbd. After testdriving it for a while I was never able to
get it to download above 3MB/s (2 average), whereas `nzbget` (the program used by Synology's
own Download Station) peaks at 8MB/s (6 average).

Although I really like that SABnzbd automatically unpacks your downloads, these speed differences
made me decide to go back to nzbget. The `j` is just not powerful enough to do SABnzbd at these
speeds, and I can write auto-unpackers myself.

### Optimal Config

I found that optimal speeds can be reached by letting your Synology download with 8
connections on 1 single download. With these settings the load reaches 11, so don't
expect your NAS to do anything else while it's busy. But at least you're saturating
your connection.

If you want it to multitask, limit it to 1 connection on 1 single download at any time,
but you won't see it peak beyond 2MB/s.

If you use it for torrents as well, you don't want 1 slow torrent blocking the rest of
the queue. In that case, set it to 2 to 3 connections with 2 to 4 threads each for optimal downloading.

## Turn Off Unused Protocols

Decide on 1 file-sharing protocol (I chose Mac File service cause all my systems speak it
and use Time Machine, but SMB/Windows is typically the right choice).
Disable the rest in your configuration panel, saving a few precious MBs of RAM.

This is all just done from your web-interface.

## SSH Access

Before you can do any hacking on your Synology, turn on SSH access in the web-interface's
control panel.
You can now type: `ssh root@<nas ip>`. Followed by `sh`. The root password is the same as
the admin password.

## AppStore :)

[Get your hands on ipkg](https://forum.synology.com/wiki/index.php/Overview-on-modifying-the-Synology-Server,-bootstrap,-ipkg-etc#How-to-install-ipkg),
which is like your Synology's secret AppStore. From here on, it's much easier to install cool
additional software.

## Turn Off Media Indexers to Free Up CPU & Memory

When I logged in to see what was eating up my NAS' resources, I saw a lot of processes running that I don't need
such as thumbnail generators and media indexers (`ffmpeg` & `convert`).
They were endlessly consuming 100% CPU, leaving nothing for my other tasks.

Any currently available NAS is a terrible media streamer.
And that's ok, just get yourself an AC Ryan ($80) or Boxee Box ($250) to do that instead and
dedicate your NAS to fewer tasks.

In my case that meant killing off all these wannabe media processes that are eating up your
poor handheld CPU with 128MB RAM. Every MB we save from this point forward counts towards faster
download speeds :)

So if you don't use the Photo/Media/iTunes station and would like more power for other
tasks, consider turning off indexers:

1) Turn off all services in the bottom configuration panel (iTunes, everything except
   Download Station, unless you're going to use
   [SABnzbd](/blog/2011/01/28/install-sabnzbd-on-your-synology/) for this)

2) Login as root via SSH and stop all indexing by pasting:

```bash
/usr/syno/etc/rc.d/S??synoindexd.sh stop
/usr/syno/etc/rc.d/S??synomkflvd.sh stop
/usr/syno/etc/rc.d/S??synomkthumbd.sh stop
killall -9 convert
killall -9 ffmpeg
# If you don't use Download Station (but e.g. SABnzbd instead):
# /usr/syno/etc/rc.d/S??pgsql.sh stop
```

3) Make sure they won't restart on your next reboot by pasting:

```bash
chmod a-x /usr/syno/etc/rc.d/S??synoindexd.sh
chmod a-x /usr/syno/etc/rc.d/S??synomkflvd.sh
chmod a-x /usr/syno/etc/rc.d/S??synomkthumbd.sh
# If you don't use Download Station (but e.g. SABnzbd instead):
# chmod a-x /usr/syno/etc/rc.d/S??pgsql.sh
```

Hint) After a DSM firmware upgrade, you need to repeat these steps.
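
Since you'll be repeating step 3 after every firmware upgrade, it may be worth wrapping it in a function. A minimal sketch follows; the name `disable_rc_scripts` is made up for this example, and it only covers the `chmod a-x` part, not stopping the running services:

```bash
#!/usr/bin/env bash
# Hypothetical helper: remove the executable bit from S??<name>.sh in a directory.
disable_rc_scripts() {
  local rc_dir="$1" name script
  shift
  for name in "$@"; do
    for script in "$rc_dir"/S??"$name".sh; do
      # An unmatched glob stays literal, so skip anything that does not exist
      [ -f "$script" ] || continue
      chmod a-x "$script"
      echo "disabled ${script}"
    done
  done
}

# Usage (as root on the NAS):
# disable_rc_scripts /usr/syno/etc/rc.d synoindexd synomkflvd synomkthumbd
```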

## Custom Cleanup & Rename Script Cause SAB Is Too Slow

Building your own cleanup scripts can be fun (and risky).
If you want to get into it, you'll need some system tools at your disposal.

Here's what I cooked up to take care of my downloads:

```bash
#!/opt/bin/bash
# @todo: Don't delete parent dir if Dir == Root
# @todo: Root = $1 - But what about series!

set +x
export PATH="/opt/bin:/opt/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/syno/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/syno/bin:/usr/syno/sbin:/usr/local/bin:/usr/local/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/syno/bin:/usr/syno/sbin:/usr/local/bin:/usr/local/sbin"

# Locking
LockFile="/volume1/downloads/nas_is_unpacking.lock"
if [ -f "${LockFile}" ]; then
	echo "Lockfile still exists: ${LockFile}. Aborting"
	exit 0
fi
trap "{ rm -f ${LockFile} ; exit 255; }" EXIT
date > ${LockFile}

echo "Running ${0} on $(date)"

Root="/volume1/downloads"
Home="$(pwd)"
Purged=""

# Downloadstation
echo "Looking for downloadstation tasks..."
Prevdir=""
find ${Root}/_queue -mmin +5 \( -iname '*.nzb' -o -iname '*.torrent' \) |sort | while read File; do
	Dir="$(dirname "$File")"
	if [ "${Prevdir}" != "${Dir}" ]; then
		cd "${Dir}"
		echo ""
		echo "= $(pwd)"
		echo "================================================================================================"
	fi
	# Hand each queued nzb/torrent file over to Download Station
	/opt/bin/downloadstation add "${File}"
	if [ $? -eq 0 ]; then
		echo "Successfully added ${File}; purging file"
		rm -f "${File}"
	else
		echo "Unable to add ${File}"
	fi
	Prevdir=$Dir
done
cd "${Home}"

# PAR
echo "Looking for files to repair..."
Prevdir=""
find ${Root} -mmin +5 -iname '*.par2' |sort | while read File; do
	Dir="$(dirname "$File")"
	if [ "${Prevdir}" != "${Dir}" ]; then
		cd "${Dir}"
		echo ""
		echo "= $(pwd)"
		echo "================================================================================================"
		# Process the first par file in this directory thanks to |sort
		par2 r "${File}"
		if [ $? -eq 0 ]; then
			echo "Successfully repaired; purging par files"
			rm -f *.par2
			rm -f *.PAR2
		else
			echo "Unable to repair; purging entire directory"
			Purged="${Purged}${Dir}\n"
			cd ..
			rm -rf "${Dir}"
		fi
	fi
	Prevdir=$Dir
done
cd "${Home}"

# RAR
echo "Looking for rar files to unpack..."
Prevdir=""
find ${Root} -mmin +5 -iname '*.rar' |sort | while read File; do
	Dir="$(dirname "$File")"
	if [ "${Prevdir}" != "${Dir}" ]; then
		cd "${Dir}"
		echo ""
		echo "= $(pwd)"
		echo "================================================================================================"
		# Process the first rar file in this directory thanks to |sort
		unrar e -y -o+ -p- "${File}"
		if [ $? -eq 0 ]; then
			echo "Successfully unpacked; purging rar files"
			rm -f *.rar
			rm -f *.r[0-9][0-9]
			rm -f *.s[0-9][0-9]
			rm -f *.t[0-9][0-9]
		else
			echo "Unable to unpack; purging entire directory"
			Purged="${Purged}${Dir}\n"
			cd ..
			rm -rf "${Dir}"
		fi
	fi
	Prevdir=$Dir
done
cd "${Home}"

# 7zip
echo "Looking for 7zip files to unpack..."
Prevdir=""
find ${Root} -mmin +5 -iname '*.7z.001' |sort | while read File; do
	Dir="$(dirname "$File")"
	if [ "${Prevdir}" != "${Dir}" ]; then
		cd "${Dir}"
		echo ""
		echo "= $(pwd)"
		echo "================================================================================================"
		# Process the first 7zip file in this directory thanks to |sort
		7z x "${File}"
		if [ $? -eq 0 ]; then
			echo "Successfully unpacked; purging 7zip files"
			rm -f *.7z.[0-9][0-9][0-9]
		else
			echo "Unable to unpack; purging entire directory"
			Purged="${Purged}${Dir}\n"
			cd ..
			rm -rf "${Dir}"
		fi
	fi
	Prevdir=$Dir
done
cd "${Home}"

# Clean Up Unwanted Files Next to Video Files
echo "Looking for files to clean..."
Prevdir=""
find ${Root} -mmin +5 \( -iname '*.mkv' -o -iname '*.avi' \) |sort | while read File; do
	Dir="$(dirname "$File")"
	Parent="$(dirname "$Dir")"
	if [ "${Prevdir}" != "${Dir}" ]; then
		cd "${Dir}"
		echo ""
		echo "= $(pwd)"
		echo "================================================================================================"
		rm -f *.1 2> /dev/null
		rm -f *.2 2> /dev/null
		rm -f *.nzb 2> /dev/null
		rm -f *.nfo 2> /dev/null
		rm -f *.par2_hellanzb_dupe0 2> /dev/null
		rm -f *.sfv 2> /dev/null
		rm -f *.srr 2> /dev/null
		rm -f *.segment000[0-9] 2> /dev/null
		# in the middle
		rm -f *[.-][Ss][Aa][Mm][Pp][Ll][Ee][.-]*.{mkv,avi,mpg,srs} 2> /dev/null
		# at the end
		rm -f *[.-][Ss][Aa][Mm][Pp][Ll][Ee].{mkv,avi,mpg,srs} 2> /dev/null
		# at the beginning
		rm -f [Ss][Aa][Mm][Pp][Ll][Ee][.-]*.{mkv,avi,mpg,srs} 2> /dev/null
		# complete
		rm -f [Ss][Aa][Mm][Pp][Ll][Ee].{mkv,avi,mpg,srs} 2> /dev/null

		# Synology media thumbs
		rm -rf @eaDir
	fi
	Prevdir=$Dir
done
cd "${Home}"

# Move Lonely Files 1 Dir Up & Rename to Parent Dir
echo "Looking for lonely files to promote 1 directory up..."
Prevdir=""
find ${Root} -mmin +5 \( -iname '*.mkv' -o -iname '*.avi' -o -iname '*.ts' \) |sort | while read File; do
	Dir="$(dirname "$File")"
	Parent="$(dirname "$Dir")"
	if [ "${Prevdir}" != "${Dir}" ]; then
		cd "${Dir}"

		if [ "$(ls -l |grep -v 'total ' |wc -l)" = "1" ]; then
			Basedir="$(basename "${Dir}")"
			Newname="$(echo "${Basedir}")"
			Ext=${File##*.}
			Newname="${Newname}.${Ext}"

			#cmd="mv \"${File}\" \"${Parent}/${Newname}\" && rmdir \"${Dir}\""
			mv "${File}" "${Parent}/${Newname}" && rmdir "${Dir}"
			echo "promoted: ${Parent}/${Newname}"
		fi
	fi
	Prevdir=$Dir
done
cd "${Home}"

## TV Episodes
# Please Use FileBot Instead. Much Better Results.
#if [ "${1}" = "tvnamer" ]; then
#	echo "Looking for tv episodes to rename..."
#	tvnamer -r --batch /volume1/video/series
#fi

# REPORT
if [ -n "${Purged}" ]; then
	echo ""
	echo "Had to purge these directories cause they were damaged beyond repair:"
	echo -e "${Purged}"
fi

echo "Done"
```

It runs every 15 minutes via cron. It removes broken downloads, unpacks complete ones, moves lonely files 1 directory up,
deletes a bunch of unwanted extensions, etc.
It makes a few assumptions (e.g. downloads must be in `/volume1/downloads`), so be sure to only use it for inspiration.

It's a work in progress, and improvements are more than welcome.
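
One thing to watch for when adapting the script: in most shells every part of a pipeline runs in a subshell, so a variable such as `Purged` that is filled inside a `find ... | while read` loop may be empty again once the loop ends. Below is a minimal sketch of one way around that, feeding the loop from a here-document instead of a pipe. The function name `collect_failed` and the example paths are made up for illustration.

```bash
#!/bin/sh
# Sketch: collect directory names inside a loop without losing them to a subshell.
# `collect_failed` is a hypothetical helper, not part of the original script.
collect_failed() {
  failed=""
  # The loop runs in the current shell because it reads from a here-document
  # rather than sitting at the end of a pipeline.
  while IFS= read -r dir; do
    failed="${failed}${dir}\n"
  done <<EOF
$(printf '%s\n' "$@" | sort)
EOF
  # %b expands the literal \n sequences, like `echo -e` does in the report step.
  printf '%b' "${failed}"
}

collect_failed "/volume1/downloads/b" "/volume1/downloads/a"
```

The same pattern could wrap each `find` call in the script, at the cost of reading the whole file list before the loop starts.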

### Downloadstation CLI

To have your Synology scan a directory for new download tasks, you can use Downloadstation CLI.

```bash
$ ipkg install python24 py-pgsql py24-mx-base
$ curl https://downloadstation.jroene.de/downloadstation -ko /opt/bin/downloadstation \
 && chmod a+x $_
```

With the command

```bash
$ downloadstation add $nzbfile
```

The download will be added to the queue. If you use an adaptation of my unpacker script, it will already automatically scan `/volume1/downloads/_queue` for any new torrent or nzb task.
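
When scanning for several extensions at once, keep in mind that `find` binds `-o` (or) more loosely than the implicit "and", so extra tests such as `-mmin +5` only apply to the first pattern unless the alternatives are grouped in escaped parentheses. A small sketch of such a scan follows. `list_tasks` is a hypothetical helper name, `QUEUE_DIR` is an invented variable, and the `echo` stands in for the real `downloadstation add` call.

```bash
#!/bin/sh
# Sketch: list queued task files, applying every test to both extensions.
# `list_tasks` is a made-up name; the directory is passed in as an argument.
list_tasks() {
  # \( ... \) groups the name patterns so -type f covers both of them.
  # Further tests (e.g. -mmin +5) would go before the group as well.
  find "$1" -type f \( -iname '*.nzb' -o -iname '*.torrent' \) | sort
}

# Example: print what would be queued (swap echo for the real add command).
list_tasks "${QUEUE_DIR:-/volume1/downloads/_queue}" 2>/dev/null | while IFS= read -r task; do
  echo "Would add: ${task}"
done
```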

### Tools

These programs may take up a little bit of space, but won't be active in
memory until you call upon them (except for `cron`), so feel free to
install without performance loss:

```bash
$ ipkg install vim bash bash-completion less rsync mtr \
  sudo tshark htop openssl mlocate perl ack hdparm sysstat dstat \
  bzip2 unrar unzip zlib p7zip wget

$ curl https://raw.github.com/timkay/solo/master/solo -ko /usr/bin/solo \
 && chmod a+x $_
```

Optionally do `ipkg install clamav` so you can run `clamscan` on freshly downloaded files
and check them for viruses (I decided not to).

### Renaming Files

There's a neat program called [tvnamer](https://github.com/dbr/tvnamer) that will
rename all your TV series files.

Install:

```bash
$ ipkg install python25 py25-setuptools git \
 && cd /volume1/@tmp \
 && git clone https://github.com/dbr/tvnamer.git \
 && cd tvnamer \
 && python setup.py install \
 && ln -s /opt/local/bin/tvnamer /usr/bin/tvnamer
```

Use:

```bash
$ tvnamer -r /volume1/video/tv
```

[FileBot](https://filebot.sourceforge.net/) is even better but requires a GUI.

### Crontab

Crontab works slightly differently than on more high-level operating systems.

Here's how to edit your crontab:

```bash
$ $EDITOR /etc/crontab
```

Every job needs a user prefix. e.g. `root`:

```bash
*/15 * * * * root /usr/bin/solo -port=1111 /volume1/video/unpacker.sh > /volume1/@tmp/unpacker.log 2>&1
```
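
In a job line like this, the order of the redirections matters: the shell applies them left to right, so stdout has to be pointed at the log file before stderr is duplicated onto it. A tiny sketch to try this out follows; `noisy` is a stand-in for the real script and the temporary file is only there for demonstration.

```bash
#!/bin/sh
# Sketch: capture both stdout and stderr of a command in one log file.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

log="$(mktemp)"
# First send stdout to the file, then point stderr at wherever stdout goes now.
noisy > "${log}" 2>&1
cat "${log}"
```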

When you're done editing the new crontab, reload it by executing:

```bash
$ /usr/syno/etc.defaults/rc.d/S??crond.sh stop
$ /usr/syno/etc.defaults/rc.d/S??crond.sh start
```

### Tmux or Screen

If you start programs from within [tmux](https://tmux.sourceforge.net/), you can
close your SSH session without killing them. You can check back on them later
with `tmux attach || tmux`.

This makes it perfect to run cleanup/rename scripts in while you're still experimenting
and need to check up on them regularly.

Tmux is similar to [screen](https://www.gnu.org/software/screen/manual/screen.html),
but I think it's a bit easier to deal with (just `tmux attach || tmux` is all).

However `screen` is a lot easier to install thanks to `ipkg`, so pick your poison.

#### Screen

```bash
$ ipkg install screen
```

#### Tmux

```bash
$ ipkg install libevent optware-devel ncurses-dev

# https://forum.synology.com/enu/viewtopic.php?f=90&t=30132
$ mkdir /opt/arm-none-linux-gnueabi/lib_disabled \
 && mv /opt/arm-none-linux-gnueabi/lib/libpthread* /opt/arm-none-linux-gnueabi/lib_disabled \
 && cp /lib/libpthread.so.0 /opt/arm-none-linux-gnueabi/lib/ \
 && cd /opt/arm-none-linux-gnueabi/lib/ \
 && ln -s libpthread.so.0 libpthread.so \
 && ln -s libpthread.so.0 libpthread-2.5.so

# Notes: --prefix is not supported, so we'll need some symlinks afterwards.
# The `make` step will take a while.
$ cd /volume1/@tmp \
 && wget https://sunet.dl.sourceforge.net/project/tmux/tmux/tmux-1.4/tmux-1.4.tar.gz \
 && tar -zxvf tmux-1.4.tar.gz \
 && cd tmux-1.4 \
 && export CC=gcc \
 && export CFLAGS="-L /opt/lib -I  /opt/include/ncurses" \
 && ./configure --prefix=/opt \
 && make \
 && make install \
 && ln -s /opt/lib/libevent-1.4.so.2 /usr/lib/libevent-1.4.so.2 \
 && ln -s /opt/share/terminfo/* /usr/share/terminfo/
```
]]></content:encoded>
      <dc:date>2011-02-28T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Install SABnzbd, Sickbeard, Couchpotato on Your Synology DSM 3 NAS</title>
      <link>https://kvz.io/install-sabnzbd-on-your-synology.html</link>
      <description><![CDATA[The Synology ships with a Download Station but it's not remotely as
advanced as SABnzbd. What I mostly miss is automatic
par &amp; unpacking of its downloads. Here's how to fix that.
]]></description>
      <pubDate>Fri, 28 Jan 2011 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/install-sabnzbd-on-your-synology.html</guid>
      <content:encoded><![CDATA[The Synology ships with a Download Station but it's not remotely as
advanced as [SABnzbd](https://sabnzbd.org/). What I mostly miss is automatic
par & unpacking of its downloads. Here's how to fix that.

<!--more-->

## Warning

This article assumes you're somewhat skilled in Linux. By applying these
suggestions you could seriously mess up your Disk Station.

I'm doing this on a DS411j running DSM 3.0. Your mileage may vary.

## Maybe you don't want SABnzbd after all

In [a later article](/blog/2011/02/28/optimize-your-synology-for-downloading/) I demonstrate
how much higher download speeds can be achieved with Synology's own `nzbget`, and a way
to still have automatic par & unpacking of its downloads.

## First of All

- Turn off the Download Station (Config screen)
- Turn on SSH access (Terminal)
- Login as root (same password as admin) via SSH and type: `sh`
- [install ipkg](https://forum.synology.com/wiki/index.php/Overview-on-modifying-the-Synology-Server,-bootstrap,-ipkg-etc#How-to-install-ipkg),
  which is like your Synology's secret AppStore.

## Install SABnzbd & the Family

If you paste the following it installs [SABnzbd](https://sabnzbd.org/) as root. There are some
serious risks involved with that so you may want to
change it to a user with fewer permissions.

```bash
ipkg install bzip2 par2cmdline unrar unzip zlib git
ipkg install py26-cheetah py26-openssl python26
ipkg install sabnzbdplus

cd /opt/local
[ -d sickbeard/.git ] || git clone git://github.com/midgetspy/Sick-Beard.git  sickbeard
cd sickbeard
git pull

cd /opt/local
[ -d couchpotato/.git ] || git clone git://github.com/RuudBurger/CouchPotato.git couchpotato
cd couchpotato
git pull
```

## Config

Most can be done via the web interfaces, except for the standard port of CouchPotato,
which conflicts with the Synology interface (`:5000`).

Open the file `/opt/local/couchpotato/config.ini`, look for the `[global]` header and change
the `port` key to: `9300`

```bash
[global]
port = 9300
```
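
If you prefer to script this change, something along these lines may work. It is only a sketch: `set_global_port` is a hypothetical helper, it assumes a `port = ...` line already exists under `[global]`, and it relies on `sed -i`, which GNU and BusyBox sed provide.

```bash
#!/bin/sh
# Sketch: set the port key inside the [global] section of an ini file.
# `set_global_port` is a made-up helper: set_global_port <ini file> <port>
set_global_port() {
  # The address range runs from the [global] header to the next section header,
  # so port keys in other sections are left alone.
  sed -i "/^\[global\]/,/^\[/ s/^port *=.*/port = $2/" "$1"
}

# Usage (path taken from the article):
#   set_global_port /opt/local/couchpotato/config.ini 9300
```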

## Startup

Create a startup file:

```bash
$ cat << EOT > /usr/syno/etc/rc.d/S99SABnzbd.sh
#!/usr/bin/env sh
if [ "start" = "\$1" ]; then
  /opt/bin/python2.6 /opt/share/SABnzbd/SABnzbd.py -f /root/.sabnzbd/sabnzbd.ini -s 0.0.0.0:9200 -d
  /opt/bin/python2.6 /opt/local/couchpotato/CouchPotato.py --config=/opt/local/couchpotato/config.ini --datadir=/volume1 -d
  /opt/bin/python2.6 /opt/local/sickbeard/SickBeard.py --quiet --port 9400 --config /opt/local/sickbeard/config.ini --datadir=/volume1 --daemon
elif [ "stop" = "\$1" ]; then
  /usr/bin/killall -9 python2.6
elif [ "restart" = "\$1" ]; then
  \$0 stop
  \$0 start
elif [ "" = "\$1" ]; then
  echo "Start, stop or restart service? Use a parameter..."
fi
EOT
```

Make it executable:

```bash
$ chmod a+x /usr/syno/etc/rc.d/S99SABnzbd.sh
```

And run it:

```bash
$ /usr/syno/etc/rc.d/S99SABnzbd.sh start
```

Connect to:

- <http://[your_nas_ip]:9200/> for SABnzbd
- <http://[your_nas_ip]:9300/> for CouchPotato
- <http://[your_nas_ip]:9400/> for Sickbeard

Next time your Synology boots, SABnzbd and its family will start as well.
]]></content:encoded>
      <dc:date>2011-01-28T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Sync Vim Config Across Workspaces</title>
      <link>https://kvz.io/sync-vim-accross-workplaces.html</link>
      <description><![CDATA[As a Vim newbie, I'd like my Vim plugins &amp; configuration
to stay in sync between machines at home, office, my servers &amp; a laptop.
]]></description>
      <pubDate>Fri, 03 Dec 2010 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/sync-vim-accross-workplaces.html</guid>
      <content:encoded><![CDATA[As a [Vim newbie](/blog/2010/11/30/learning-vim/), I'd like my Vim plugins & configuration
to stay in sync between machines at home, office, my servers & a laptop.

I found that a (free)
[Dropbox](https://www.dropbox.com/referrals/NTM0MzU3OTk?src=global0 "Register with this link to get me more space")
account works like a charm.

<!--more-->

First copy your best config to your Dropbox:
![logo.png](/assets/images/posts/1-logo.png "logo.png")

```bash
$ # In this setup, your config will be accessible by the world
$ mkdir -p  ~/Dropbox/Public/configs/vim
$ cp -raf ~/.vim  ~/Dropbox/Public/configs/vim/.vim
$ cp -raf ~/.vimrc  ~/Dropbox/Public/configs/vim/.vimrc
```

Next, on all machines, purge the currently installed vim config:

```bash
$ mv ~/.vim* /tmp/
```

.. and symlink the Dropbox's config in place:

```bash
$ ln -nfs ~/Dropbox/Public/configs/vim/.vim ~/.vim
$ ln -nfs ~/Dropbox/Public/configs/vim/.vimrc ~/.vimrc
```

Now when you change your Vim config anywhere, changes will be synced to all the
workstations that you have set up this way.
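
The purge and symlink steps can be wrapped into one small function so they are safe to re-run on every machine. This is a sketch under assumptions: `link_config` is a hypothetical name, and instead of moving old config to `/tmp` it keeps a timestamped backup next to the original.

```bash
#!/bin/sh
# Sketch: replace a config path with a symlink, backing up any real file first.
# `link_config` is a made-up helper: link_config <source> <destination>
link_config() {
  # Only back up real files or directories; an existing symlink is just replaced.
  if [ -e "$2" ] && [ ! -L "$2" ]; then
    mv "$2" "$2.bak.$(date +%s)"
  fi
  ln -nfs "$1" "$2"
}

# Usage, with the Dropbox paths from above:
#   link_config ~/Dropbox/Public/configs/vim/.vim   ~/.vim
#   link_config ~/Dropbox/Public/configs/vim/.vimrc ~/.vimrc
```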

## Get Config on Untrusted Machines

Extra perk: because we use the Dropbox Public folder, you can even access your
config on machines that aren't connected to your Dropbox by using `curl`:

```bash
$ curl https://dl.dropbox.com/u/XXXXXX/configs/vim/.vimrc -ko ~/.vimrc
```

- Replace XXXXXX with your user id (or right click the config, copy public link)
- Just the main config, no plugins (use a zip or web interface for that)

## Trusted Servers (No GUI)

On trusted servers, you can even [install Dropbox without a GUI](https://ddorda.useopensource.net/?p=1259) :D

### 32bits

```bash
$ pushd /usr/src
$ curl "https://www.dropbox.com/download/?plat=lnx.x86" -ko dropbox.tar.gz
$ tar -zxvf $_
$ .dropbox-dist/dropboxd &
$ popd
```

### 64bits

```bash
$ pushd /usr/src
$ curl "https://www.dropbox.com/download/?plat=lnx.x86_64" -ko dropbox.tar.gz
$ tar -zxvf $_
$ .dropbox-dist/dropboxd &
$ popd
```

### Then

Follow the instructions on screen.

The Dropbox folder is located at `~/Dropbox` as usual.

To make sure Dropbox will load on startup run:

```bash
$ dropbox autostart
```

Afterwards symlink the config like you would normally.
]]></content:encoded>
      <dc:date>2010-12-03T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Learning Vim</title>
      <link>https://kvz.io/learning-vim.html</link>
      <description><![CDATA[In an attempt to familiarize myself with the unfamiliar, I decided to build
a fun side-project in Ruby and Vim.
Effectively learning a new language, framework, and editor.
]]></description>
      <pubDate>Tue, 30 Nov 2010 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/learning-vim.html</guid>
      <content:encoded><![CDATA[In an attempt to familiarize myself with the unfamiliar, I decided to build
a fun side-project in [Ruby](/blog/2010/09/21/ruby-with-nginx-on-ubuntu-lucid/) and Vim.
Effectively learning a new language, framework, and editor.

Coming from Nano, Quanta,
[Eclipse PDT](/blog/2008/04/11/my-new-ide-eclipse-pdt/),
TextMate,
[Netbeans](/blog/2008/12/02/my-new-ide-netbeans/); I found (Mac/g)Vim is a big step, and
for the first two weeks you should not expect to be productive.

<!--more-->

But after some persistence, I'm now faster in Vim than I was in my previous
editors. Except for NetBeans still. Maybe that changes as I get better,
but having an editor that understands your code is also a powerful thing.
Scope-aware refactoring and jumping to declarations are why I keep both
NetBeans and Vim around. Vim, on the other hand, lets you navigate and
hack on text like no other.

Investing in a tool like this pays off. For life.

Because as long as computers can't read our minds, we're better off maximizing
the efficiency of our typing. If your brain has to wait for your hands to
transfer the message, you're just throwing away time & creativity flow.

Here are the resources that helped me get started with Vim.

Vim screencast tutorial (awesome series):

- [#1 Basics](https://www.youtube.com/watch?v=c6WCm6z5msk)
- [#2 Motions and Commands](https://www.youtube.com/watch?v=BPDoI7gflxM)
- [#3 Search, Find and Replace](https://www.youtube.com/watch?v=J1-CfIb-3X4)

Other screencasts:

- [Top Vim Plugins](https://www.youtube.com/watch?v=-galFWwSDt0)

More resources:

- [Vim Keyboard Cheat Sheet](/assets/images/posts/2010-11-30-learning-vim-1.gif)
- [Vim Cheat Sheet](https://www.fprintf.net/vimCheatSheet.html)
- [Vim tips: Using tabs](https://www.linux.com/archive/articles/59533)
- [Use Vim like a pro](https://tottinge.blogsome.com/use-vim-like-a-pro)
- [Sync Vim Config across workspaces](/blog/2010/12/03/sync-vim-accross-workplaces/) (on this blog)

## Keyboard Shortcuts

There are many Vim cheat-sheets out there, better than this one (see resources
above). But still, I'll continuously log useful shortcuts here so I won't
regress :)

### Mode Switching

```bash
^[  exit mode
:   enter command mode

i   enter insert mode before cursor
a   enter insert mode after cursor
```

### Command Mode

```bash
w [filename]  write file
e filename    edit file
e!            reload current file

<range>s      substitute
.             repeat commands
```

### Normal Mode

This is the mode you should generally be in. Don't stick around in others
longer than necessary.

#### Movement / Motions

```bash
j  move down
k  move up
h  move left
l  move right

0  move to line start
^  move to first char in line
$  move to last char in line

}  move down a paragraph
{  move up a paragraph

^d down half a page
^u up half a page

gg move to top
G  move to bottom

#  prev word like current
*  next word like current

%  move to matching closing/starting tag/comment/brace/statement
`. move to your last edit

b  back a word
w  to next word
W  to next WORD (space terminated, ignore e.g. commas)
e  to end of the current word
E  end of the current WORD (space terminated)
```

#### Manipulation

```bash
u          undo
x          delete character (and saves to clipboard)

c<motion>  change (enters insert mode)
y<motion>  copy (yank)
d<motion>  cut (delete) (and saves to clipboard)

p          paste after cursor
P          paste before cursor

<          unindent
>          indent
```

#### Search & Replace

```bash
/   start search down the document
?   start search up the document
^M  stop  search

n   move to next occurrence
N   move to prev occurrence

f   find symbol after cursor
F   find symbol before cursor
;   move to next occurrence of symbol
t   find until symbol after cursor

m<bookmark> create bookmark. e.g.: m1
`<bookmark> jump to bookmark (exact cursor position)
'<bookmark> jump to bookmark (beginning of line)
`.          jump to last edit (exact cursor position)
'.          jump to last edit (beginning of line)
:marks      show bookmarks
```

#### Ranges

```bash
%  entire document
```

#### Combined

```bash
ctK                 change until next capital 'K'
yy                  copy (yank) line
d2e                 delete till the end of the second word

%s/name/kevin/g     change all occurrences of name to kevin in entire document
3,6 s/name/kevin/g  change all occurrences of name to kevin from line 3 to 6

D                   delete until end of line
Y                   copy (yank) entire line (same as yy)

ct,                 change until ','
cf,                 change until & including ','

>7j                 indent this line & 7 next lines
<}                  unindent until next paragraph
3dd                 delete 3 lines (3x delete current line)
12x                 delete 12 characters
```

### Insert Mode

```bash
^n                  Next autocomplete suggestion
^p                  Previous autocomplete suggestion
```

### Shell Filtering

```bash
!            enter mode
!!<command>  apply filter on current line
!}sort       sort paragraph
:%!sort      sort entire file
!Gsort       sort until bottom
```
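
Filtering simply pipes the selected lines through the command's standard input and replaces them with its output, so any filter can be tried out in a shell first. Here is a rough sketch of what `:%!sort` amounts to. The `filter_file` helper and the demo file are made up for the example, and unlike Vim it leaves the file alone when the filter fails.

```bash
#!/bin/sh
# Sketch: emulate Vim's :%!<command> by running a file through a filter in place.
# `filter_file` is a hypothetical helper: filter_file <file> <command> [args...]
filter_file() {
  file="$1"
  shift
  tmp="$(mktemp)"
  # Only overwrite the file when the filter succeeded.
  "$@" < "${file}" > "${tmp}" && cat "${tmp}" > "${file}"
  rm -f "${tmp}"
}

demo="$(mktemp)"
printf 'pear\napple\nfig\n' > "${demo}"
filter_file "${demo}" sort
cat "${demo}"
```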

Alright, that's all I got for now. What are your experiences with Vim?
]]></content:encoded>
      <dc:date>2010-11-30T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>5,000,000 Visitors = Free Beer for Switzerland!</title>
      <link>https://kvz.io/5000000-visitors-free-beer-for-switzerland.html</link>
      <description><![CDATA[When I started this techblog in 2007 and got my first 500 real visitors, I was in
the clouds. If you told me then I'd hit the 5,000,000 visitor milestone 3 years later,
I would have probably slapped some sense into you.
]]></description>
      <pubDate>Fri, 15 Oct 2010 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/5000000-visitors-free-beer-for-switzerland.html</guid>
      <content:encoded><![CDATA[When I started this techblog in 2007 and got my first 500 real visitors, I was in
the clouds. If you told me then I'd hit the 5,000,000 visitor milestone 3 years later,
I would have probably slapped some sense into you.

Not in my wildest dreams did I imagine my little side-project would take off like this.

<!--more-->

Yet here we are.

To celebrate I wanted to give a prize to the exact 5 millionth visitor, but all I know
is that he/she is from Switzerland (the visit came via google.ch). So instead I'll give **free beer**
of choice to the first person from **Switzerland** to leave a comment.

To all other nations & visitors: 1 Free beer for the best [Swiss](https://en.wikipedia.org/wiki/Swiss-people)
imitation. But more importantly, a big thanks to you for taking the time to give so much
feedback and help me improve my writing and understanding of the topics I share about.

I've learned a lot from you guys and it's humbling to be surrounded by so many bright people.
I really hope to keep writing at least 1 article a month for as long as I can
type. And I hope that you will keep voicing your opinions & improvements. Thanks everyone.

> Update: The beer was handed out to a Swiss CakePHP coder on CakeFest (2011 in Manchester?), with [Marc Ypes](https://github.com/ceeram) as my witness :smile:
]]></content:encoded>
      <dc:date>2010-10-15T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Running Ruby on Rails on Nginx</title>
      <link>https://kvz.io/ruby-with-nginx-on-ubuntu-lucid.html</link>
      <description><![CDATA[If you want to set up Ruby on Rails on Ubuntu Lucid from scratch, there are
quite some articles online to choose from. I found most of them involve compiling,
only highlight 1 aspect, or are a bit outdated.
]]></description>
      <pubDate>Tue, 21 Sep 2010 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/ruby-with-nginx-on-ubuntu-lucid.html</guid>
      <content:encoded><![CDATA[If you want to set up Ruby on Rails on Ubuntu Lucid from scratch, there are
[quite](https://www.blog.bridgeutopiaweb.com/post/rails-3-on-ubuntu-karmic-koala-fivebean/)
[some](https://www.modrails.com/install.html)
[articles](https://articles.slicehost.com/2008/5/27/ubuntu-hardy-nginx-rails-and-mongrels) online to choose from. I found most of them involve compiling,
only highlight 1 aspect, or are a bit outdated.

On top of that, getting it right can be hard as there are a number of
[issues](https://www.lucas-nussbaum.net/blog/?p=566) related to
Ruby and Debian/Ubuntu.

This is an attempt to put all the sweet info in 1 place.

<!--more-->

## Ruby

```bash
$ export PATH="${PATH}:/var/lib/gems/1.8/bin/"
$ echo 'export PATH="${PATH}:/var/lib/gems/1.8/bin/"' >> /etc/bash.bashrc
$ aptitude install ruby rubygems vim-ruby ruby-dev libzlib-ruby \
libyaml-ruby libreadline-ruby libncurses-ruby rdoc ri libcurses-ruby \
libruby libruby-extras libfcgi-ruby build-essential libopenssl-ruby \
libdbm-ruby libdbi-ruby libxml-ruby libxml2-dev
```

## Rails

Simple:

```bash
$ gem install -v=2.3.5 rails
```

`gem install rails` should have worked but 2.3.6 - 2.3.8 (current at the time of writing) [have issues with mongrel](https://rails.lighthouseapp.com/projects/8994/tickets/4690).

Or, if you want to live on the edge and try the latest:

```bash
$ gem install rails --pre
```

### Or With RVM

RVM is a command line tool which allows us to easily install, manage and work with multiple ruby environments from interpreters to sets of gems.
See [installation instructions](https://rvm.beginrescueend.com/rvm/install/) and a [full tutorial](https://web2linux.com/installing-rails-3-on-ubuntu-10-04-lucid-lynx/) on that.

## App

My new app is called myapp.example.com

```bash
$ cd /var/www
$ rails new myapp.example.com
$ cd myapp.example.com
```

Have a look around and see what you can `find .`

## Thin

Thin will be the Ruby server

```bash
$ gem install thin
$ thin install
$ /usr/sbin/update-rc.d -f thin defaults
$ thin config -C /etc/thin/myapp.example.com -c /var/www/myapp.example.com --servers 3 -e development # or: -e production for caching, etc
```

### Or Mongrels

If you don't like Thin..

```bash
$ aptitude install mongrel mongrel-cluster
$ mongrel_rails cluster::configure -e development -p 3000 -N 3 -c /var/www/myapp.example.com -a 127.0.0.1 # or: -e production for caching, etc
$ mkdir /etc/mongrel_cluster
$ sudo ln -nfs /var/www/myapp.example.com/config/mongrel_cluster.yml /etc/mongrel_cluster/myapp.example.com.yml
$ #sudo ln -nfs /var/www/myapp.example.com/config/mongrel_cluster.yml /etc/mongrel-cluster/sites-enabled/myapp.example.com.yml
```

## Nginx

Nginx will be the web server, proxying Ruby requests to Thin, running on ports 3000-3002.
If you haven't installed it yet, do:

```bash
$ aptitude install nginx
```

Now that you have Nginx, create a vhost.
Edit `/etc/nginx/sites-available/myapp.example.com` and type:

```bash
upstream myapp {
  server 127.0.0.1:3000;
  server 127.0.0.1:3001;
  server 127.0.0.1:3002;
}
server {
  listen   80;
  server_name .example.com;

  access_log /var/www/myapp.example.com/log/access.log;
  error_log  /var/www/myapp.example.com/log/error.log;
  root     /var/www/myapp.example.com;
  index    index.html;

  location / {
    proxy_set_header  X-Real-IP  $remote_addr;
    proxy_set_header  X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header  Host $http_host;
    proxy_redirect  off;
    try_files /system/maintenance.html $uri $uri/index.html $uri.html @ruby;
  }

  location @ruby {
    proxy_pass http://myapp;
  }
}
```

Then make it available to the public

```bash
$ ln -nfs /etc/nginx/sites-available/myapp.example.com /etc/nginx/sites-enabled/myapp.example.com
```
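
The `upstream` block needs one `server` line per Thin instance. With `--servers 3` and a base port of 3000 (matching the ports used above) that means 3000 through 3002. If you change the server count, a sketch like the following can regenerate the block. `upstream_block` is a hypothetical helper, not something Thin or Nginx ships.

```bash
#!/bin/sh
# Sketch: print an nginx upstream block for N backends on consecutive ports.
# `upstream_block` is a made-up helper: upstream_block <name> <base port> <count>
upstream_block() {
  echo "upstream $1 {"
  i=0
  while [ "${i}" -lt "$3" ]; do
    echo "  server 127.0.0.1:$(( $2 + i ));"
    i=$(( i + 1 ))
  done
  echo "}"
}

upstream_block myapp 3000 3
```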

## Databases

First set up SQLite

```bash
$ aptitude install -y libdbd-sqlite3-ruby sqlite3 libsqlite3-dev libsqlite3-ruby
$ gem install sqlite3-ruby
```

### MySQL?

Optionally if you want to use MySQL install the following (but do sqlite anyway):

```bash
$ aptitude install -y libmysqlclient-dev
$ gem install mysql
```

Then change your `/var/www/myapp.example.com/config/database.yml` and make it say something along the lines of

```yaml
development:
  adapter: mysql
  host: localhost
  database: myapp
  username: myapp
  password: xxxxxxx
```

Note! `database.yml` doesn't accept tabs. If you are in vim, you might need to do:

```bash
:set expandtab
#:set tabstop=4 # how many spaces should tabs be replaced with
:retab
```
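
To check from the shell whether a YAML file still contains tabs, a sketch like this may help. `has_tabs` is a made-up helper name.

```bash
#!/bin/sh
# Sketch: print every line of a file that contains a tab character.
# `has_tabs` is a hypothetical helper: has_tabs <file>
has_tabs() {
  # printf '\t' produces a literal tab in a portable way; -n adds line numbers.
  grep -n "$(printf '\t')" "$1"
}

# Usage:
#   has_tabs config/database.yml && echo "Tabs found, run :retab in vim"
```

Because `grep` exits non-zero when nothing matches, the helper can be used directly in an `if` or `&&` chain.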

Also, make your app require the mysql gem by adding the following to `./Gemfile`

```bash
gem 'mysql', '2.8.1'
```

I am assuming you already have a mysql-server running.
If not, you also need to `aptitude install mysql-server` first.

## Nice Gems

```bash
$ gem install \
 uuidtool \
 ruby-debug \
 ruby-graphviz \
 json \
 activemerchant
```

## Bring App Live

Let's restart our daemons to see if it worked:

For Thin:

```bash
$ /etc/init.d/thin restart && /etc/init.d/nginx reload; tail -f log/*.log
```

For Mongrel:

```bash
$ mongrel_cluster_ctl restart && /etc/init.d/nginx reload; tail -f log/*.log
```

Add this line above the 2 default routes in `config/routes.rb`:

```ruby
# Rails 2
map.root :controller => "home"
# Rails 3
#root :to => "home#index"
```

Create a home controller, add a view for it, and remove the 'Welcome aboard' html.

```bash
$ script/generate controller home index
$ rm public/index.html
$ echo '<h1>HeyO!</h1><object width="640" height="385"><param name="movie" value="https://www.youtube.com/v/9X2u2cdvJSg?fs=1&amp;hl=en_US"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="https://www.youtube.com/v/9X2u2cdvJSg?fs=1&amp;hl=en_US" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="385"></embed></object>' > app/views/home/index.erb
```

If you don't get any errors, point your browser to the Vhost you created,
and you should see a pleasant surprise.

Our work is done here.

## New to Ruby?

Here are some resources to help you further:

- [Get started](https://guides.rubyonrails.org/getting-started.html)
- [Starting Ruby on Rails: What I Wish I Knew](https://betterexplained.com/articles/starting-ruby-on-rails-what-i-wish-i-knew/)
- [Agile Web Development with Rails, 4th Edition](https://pragprog.com/titles/rails4/agile-web-development-with-rails)
- [Programming Ruby 1.9: The Pragmatic Programmers' Guide](https://pragprog.com/titles/ruby3/programming-ruby-1-9)
- [Learning Ruby - WITH THE EDGECASE RUBY KOANS](https://rubykoans.com/)
- [UCBerkeleyEvents: Ruby on Rails: Part 1: Hello world](https://www.youtube.com/watch?v=LADHwoN2LMM)

If you think I missed something, please do leave a comment and help me improve this article.
]]></content:encoded>
      <dc:date>2010-09-21T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>HAProxy Logging in Ubuntu Lucid</title>
      <link>https://kvz.io/haproxy-logging.html</link>
      <description><![CDATA[At Transloadit we use
HAProxy "The Reliable, High Performance TCP/HTTP Load Balancer" so that we can offer different services on 1 port.
]]></description>
      <pubDate>Wed, 11 Aug 2010 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/haproxy-logging.html</guid>
      <content:encoded><![CDATA[At [Transloadit](https://transloadit.com) we use
[HAProxy](https://haproxy.1wt.eu/) "The Reliable, High Performance TCP/HTTP Load Balancer" so that we can offer different services on 1 port.

For instance, depending on the hostname, a request to port 80 can be routed to either [nodejs](https://nodejs.org/) (in case of [api.transloadit.com](https://api.transloadit.com)), or [nginx](https://nginx.net/) (in case of [www.transloadit.com](https://www.transloadit.com)).

HAProxy has been good to us and setting it up was a breeze. But getting HAProxy to log on [Ubuntu](https://www.ubuntu.com/) Lucid was harder than I thought.
All of the tutorials I found either didn't cover logging, or had deprecated information on it.

Google suddenly stopped being my friend.

<!--more-->

## HAProxy Wants to Log

For performance & maintenance reasons HAProxy doesn't log directly to files. Instead it wants to log
to a syslog server. This is a separate Linux daemon that most servers are equipped with already,
but HAProxy requires it to listen on UDP port 514, and usually that's not enabled.

A syslog server:

- receives log entries
- decides what's interesting
- writes it to disk in a highly optimized way

These aspects can all be configured by you.

If we look at the top of your current `/etc/haproxy/haproxy.cfg` file, we may find something like:

```bash
global
        maxconn         10000
        ulimit-n        65536
        log             127.0.0.1 local1 notice
```

In your backends or default config, refer to `global`:

```bash
defaults
    log             global
```

As you can see `127.0.0.1` is where it will try to find a syslog server to log to.
On Ubuntu Lucid the default syslog daemon is [rsyslogd](https://manpages.ubuntu.com/manpages/hardy/man8/rsyslogd.8.html),
so let's make it accept HAProxy log entries.

## Rsyslogd Welcomes HAProxy

Most google hits I found on logging with HAProxy told me to change the `/etc/default/rsyslog` file, but
that's completely ignored with the new [upstart](https://upstart.ubuntu.com/) system.
And even if you make it adhere to the defaults file (yep, I tried), it will make
rsyslogd go down in compatibility mode. Which is not only a shame, but also
unnecessary as it turns out.

Using these config lines:

```bash
$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514
# Thanks Joeri Blokhuis of DongIT, pointing out that UDPServerAddress needs to
# go before UDPServerRun, or the server will run on 0.0.0.0
```

rsyslogd will open up its UDP port.

Where to put these lines you say? Well, if HAProxy is the only service you need the
UDP syslog port for, you could put/uncomment the lot in just
one `/etc/rsyslog.d/49-haproxy.conf` file (Thanks to [Gilles for the '49-' prefix](https://kvz.io/blog/2010/08/11/haproxy-logging/#comment-711870848)):

```bash
# .. otherwise consider putting these three in /etc/rsyslog.conf instead:
$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514

# ..and in any case, put these two in /etc/rsyslog.d/49-haproxy.conf:
local1.* -/var/log/haproxy_1.log
& ~
# & ~ means not to put what matched in the above line anywhere else for the rest of the rules
# https://serverfault.com/questions/214312/how-to-keep-haproxy-log-messages-out-of-var-log-syslog
```

Now do a quick:

```bash
$ restart rsyslog
```

And you're done. Check for HAProxy logs in:

```bash
$ tail -f /var/log/haproxy*.log
```

Don't forget to tweak the debug level in `/etc/haproxy/haproxy.cfg`, and maybe set up a logrotate right away in `/etc/logrotate.d/haproxy`:

```bash
/var/log/haproxy*.log
{
    rotate 4
    weekly
    missingok
    notifempty
    compress
    delaycompress
    sharedscripts
    postrotate
        reload rsyslog >/dev/null 2>&1 || true
    endscript
}
```

Happy logging!
]]></content:encoded>
      <dc:date>2010-08-11T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Announcing transloadit.com</title>
      <link>https://kvz.io/announcing-transloadit.html</link>
      <description><![CDATA[Today we are very happy to announce the commercial availability of
transloadit.com.
]]></description>
      <pubDate>Tue, 13 Jul 2010 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/announcing-transloadit.html</guid>
      <content:encoded><![CDATA[Today we are very happy to announce the commercial availability of
[transloadit.com](https://transloadit.com).

<!--more-->

If you have a web (or mobile) application that needs file uploading you should
consider integrating transloadit.
Transloadit will handle the upload process, resizing of images, encoding of
videos, error handling
and final storage of your content on Amazon S3 for you.

[Our plans](https://transloadit.com/plans) start at $19 / month which includes
3.5 GB of usage. This is enough for ~72 video encodings or ~717 image resizes
per month.

This project has been almost two years in the making, with over 150 people
participating in testing various versions. The version we are shipping now
has already executed 55,000 internal jobs, each spawning 2-5 command
line scripts on our servers.

We are also the first commercial software / infrastructure as a service product
built on [node.js](https://nodejs.org/). After experimenting with various
technologies, we found it to be the perfect fit for our uploading and processing
requirements.

Another thing we are very proud of is the ~95% test coverage of the service's
code base. We have an extensive suite of unit, integration and system tests
that have already proven incredibly reliable for detecting problems, be it in
our code, or changes to our stack.

If you are a long time reader of this blog, we would feel incredibly grateful
if you would spread the word about our service to your boss, co-workers and
geek-friends.

Otherwise we would be very happy to hear as much feedback, ideas and questions
as you can come up with!

**This is a double-post, the original is by my dear co-founders over at
[debuggable](https://debuggable.com/posts/announcing-transloadit-com:4c3c6a45-3950-4b13-a044-44a0cbdd56cb).**

**Go and say hi! :)**
]]></content:encoded>
      <dc:date>2010-07-13T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Notes on Dutch PHP Conference 2010</title>
      <link>https://kvz.io/notes-on-dpc10.html</link>
      <description><![CDATA[Here are the notes I took during the Dutch PHP conference 2010 (#dpc10). They're not a representative
summary of the event's highlights because I could only attend 1 of 4 talks at any given time.
]]></description>
      <pubDate>Sun, 20 Jun 2010 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/notes-on-dpc10.html</guid>
      <content:encoded><![CDATA[Here are the notes I took during the Dutch PHP conference 2010 (#dpc10). They're not a representative
summary of the event's highlights because I could only attend 1 of 4 talks at any given time.

I also filtered out things that didn't interest me personally.

<!--more-->

## Conference Day 1 - June 11 2010

### 97 things every programmer should know - Kevlin Henney ([@KevlinHenney](https://twitter.com/KevlinHenney))

- Deliberately do non-productive work to improve your skills. Work outside your subset.
  You could have 6 years of experience, or 12x 6 months if you do the same trick over & over.
- Make tag-clouds of your project's code every build to see in which language domain you're coding.
  It's a good sign if you're seeing the subject of your project in the top list.
- Compare technical debt to sprinting in a 10km run. Only do it if you have time to recover afterwards.
- Software really is soft and open to speculation. Tests are what ground it.

### Designing for Reusability - Derick Rethans ([@derickr](https://twitter.com/derickr))

- Reusability is good, but you can over-abstract. The hammer-factory-factory-factory example.
- Splitting out code in more methods makes it much easier to extend & test them.
  If you don't succeed, try test-driven development.
- PHP 5.4(?) Traits is basically compiler assisted copy-pasting. You can "use Classname;" to base a
  class on multiple others.

### Technical Debt - Elizabeth Naramore ([@ElizabethN](https://twitter.com/ElizabethN))

- Check out the [Technical Debt Quadrant](https://martinfowler.com/bliki/TechnicalDebtQuadrant.html "Reckless vs Prudent. Deliberate vs Inadvertent") by Martin Fowler
- [sonarsource.com](https://sonarsource.com) will beat PHPUnderControl in continuous integration after Sebastian Bergmann has
  finished PHP support.

Why? sonar will show you the technical debt in $$. Great for managers to realize how important it is
to refactor.

- Visibility is key. Technical debt issues should go into your bug tracker / todo list
- Kevlin Henney ([@KevlinHenney](https://twitter.com/KevlinHenney)) mentions: Put every debt issue up on a wall. There's a physical satisfaction in taking them
  down. Also, with software tools it's too easy to fold it away.

### The art of Scalability - Lorenzo Alberton ([@lorenzoalberton](https://twitter.com/lorenzoalberton))

- Always think of concepts instead of solutions, it will make it easier to switch architectures.
- Build monitoring into your app. Not just around it.
- Common mistake in logging: Never log more than you can process in real time. Signal to Noise ratio.

### Database version control without pain - Harrie Verveer ([@harrieverveer](https://twitter.com/harrieverveer))

You could use Patch-files (`patch-001.sql`) for shipping changes (deltas)

- A good idea is to also `UPDATE options SET value = 1 WHERE key = 'db_patch_level'`
  This way, you know from which point your DB still needs patches.
- Never modify a patch-file once it's under version control.
  When you commit a mistake in a patch, you'll need another one to fix it.
- Use separate `.sql` files for content
- Process by hand is tedious. Automate it: Tiny script that reads DB version,
  then iterates over necessary patches. Maybe run it right after every git pull.
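
The automation in that last point can be sketched in a few lines of shell. This is a minimal sketch, assuming patches are named `patch-NNN.sql` with zero-padded numbers and live in one directory. The `pending_patches` function and the directory layout are made up for illustration, and actually applying each file (and bumping `db_patch_level`) is left out:

```bash
# Hypothetical helper: print the patch files that still need to be applied,
# in order, given the directory with patches and the current DB patch level.
pending_patches() {
  patch_dir="$1"
  current_level="$2"

  # The glob expands in sorted order, so zero-padded names come out in sequence.
  for file in "$patch_dir"/patch-*.sql; do
    [ -e "$file" ] || continue          # no patches at all: the glob stays literal

    number="${file##*/patch-}"          # strip directory and "patch-" prefix
    number="${number%.sql}"             # strip ".sql" suffix

    # [ ... -gt ... ] compares as decimal, so a zero-padded "008" is fine here
    if [ "$number" -gt "$current_level" ]; then
      echo "$file"
    fi
  done
}

# Demo with throwaway files instead of a real database
demo_dir="$(mktemp -d)"
touch "$demo_dir/patch-001.sql" "$demo_dir/patch-002.sql" "$demo_dir/patch-003.sql"
pending_patches "$demo_dir" 1
```

With the patch level at 1, the demo lists `patch-002.sql` and `patch-003.sql`, which is the set a real runner would then feed to the database client one by one.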

Downsides of patching:

- Branching can get hairy. What to do with 2 different files called `patch-004.sql`?
- You can use date-stamps, but then you'd have to keep a log. And you still don't know
  if it will fit in between the other branch's patches
- A false sense of security

So to really nail database versioning, you're better off using tools.

- [Dbdeploy](https://dbdeploy.com): pear install phing ([Phing](https://phing.info) comes with Dbdeploy)
- [Liquibase](https://www.liquibase.org/): Knows what's happening cause definitions are in XML not SQL. So no undo files needed. Massive DB support. Great docs. XML :(
- Akrabat: If you use ZF, [check it on github](https://github.com/akrabat/Akrabat)
- Doctrine: Great migrations, but only if you already use [Doctrine ORM](https://www.doctrine-project.org/)
  (a layer between app & db. Update doctrine objects, and it will update your DB accordingly)
- CakePHP Migrations: Not mentioned in the talk, but of course there is
  [Cake Migrations by CakeDC](https://cakedc.com/downloads/view/cakephp-migrations-plugin)

### Testing untestable code - Stephan Hochdoerfer ([@shochdoerfer](https://twitter.com/shochdoerfer))

3 factors to untestable code:

- Hardcoded dependencies
- Global state / variables. Also applies to singletons & registries
- Communication with external sources

You'd want to refactor the legacy stuff, but without tests first you'll only introduce more bugs.
So, we really need tests before refactoring and you'll have to defeat the 3 factors that make
code untestable.
You could use UI testing with e.g. [selenium](https://seleniumhq.org/) but the UI won't expose all code.
Desperate times call for desperate measures. Here are some dirty tricks that let you mock dependencies,
fake filesystems, and override internal functions so you can safely test untestable code.

- Have the code load dependencies from `./custom/mock/` by overriding `__autoload()` or `include_path`
- Even freakier: `stream_wrapper_register()` your own file stream and have it return mocked files
- You can mock a filesystem with [vfsStream](https://code.google.com/p/bovigo/wiki/vfsStream)
- PHPUnit can setup & teardown any global state that's required
- unload e.g. the mysql.so extension and write your own `mysql_query()` functions
- Overriding internal PHP functions like `mail()` is harder but not impossible.
  Use runkit's `runkit_function_redefine()` for that

### Crash, Burn, Recover! - Cal Evans ([@calevans](https://twitter.com/calevans))

Running late, I picked a room by speaker & title without having read the intro. Turned out to
be a live session on Adobe Flex. Maybe not my first choice but I decided to stay and
keep an open mind.

Adobe did good to recruit Cal, his enthusiasm is contagious.
Still: Flash, heavily namespaced XML, Eclipse. It's all a bit too closed, bloated & slow for my taste.
The resulting program (a desktop recipe browser using a webservice) could have just as easily
been built in Cappuccino (yes, even as a native app, check out [github issues](https://github.com/blog/650-github-issues-cappuccino-style)) and then
it'd just be HTML & JS.

So sorry Cal, but I won't be needing one of those free licenses you were handing out.

## Conference Day 2 - June 12 2010

### Security Centered Design - Chris Shiftlett ([@shiftlett](https://twitter.com/shiftlett))

Chris is a great speaker and today was no different. Nothing too technical, but overall a very
inspiring talk on shortcomings in the human mind, and how that can cause unforeseen security risks.
Great start of the day.

- Change blindness can happen when there's just a small (loading) gap between pages, caused by your brain
  continuously flushing a great deal of data. Ajax can help counter that.
- Don't try to modify your users' expectations and tendencies, meet them. Not because you're so nice, but
  because you're lazy. "Pave the cowpath".
- Never be arrogant about security, you're only going to attract more & more people
  wanting to knock you down.

### Real world dependency injection - Stephan Hochdoerfer ([@shochdoerfer](https://twitter.com/shochdoerfer))

- "new" is the new evil. Try not to instantiate classes from within classes
- pass dependencies on construction
- Use a Container class to inject Dependencies into the Consumer
- Make injection configurable externally (like json/xml/php file).
  You don't have to touch the code to alter the behavior at all

### Embracing Constraints with CouchDB - David Zuelke ([@dzuelke](https://twitter.com/dzuelke))

Turned out to be more of an introduction than I was hoping for.
Much like [@jchris](https://twitter.com/jchris)' talk at Kings of Code last year but with fewer Hovercrafts in the slides :)
More 'your mom' jokes though. Some points that were made:

- CouchDB is not always awesome. Any data that is tabular or relational is better
  off with a relational DB.
- CAP is the triangle between Consistency, Availability, Partition tolerance.
  You can only have 2 and Couch offers: Availability & Partition tolerance.
  CouchDB isn't inconsistent, it's eventually consistent.
  However, your mom is inconsistent.
- Tool: `rlwrap http_console` is a great way of talking to HTTP servers.
- Couch works well with Lucene. Use it for complex querying.

One interesting use case was storing XML documents. Every sub-root-level tag would just be a json-key in your DB.
It would allow you to link & search on XML tag contents without parsing the entire thing every time.

### iPhone Apps in HTML5 - Thorsten Rinne

- Webkit does HTML5 so we get things like: video tags, CSS3 text shadows, local storage.
- We can style an iPhone app with jQuery: [jqTouch](https://jqtouch.com)
- [Phonegap](https://phonegap.com) packs HTML pages as apps. It unlocks iPhone features like: geolocation,
  contacts, and vibrate with a simple JavaScript API.
- Phonegap can be forked and they'll accept your patches
- Phonegap already has approval from Apple; objective-c & javascript is allowed.
- You need: [xcode, iPhone SDK](https://developer.apple.com), $99 / year fee to AppStore
- Titanium is an alternative to Phonegap, but with fewer platforms supported

### PHP Code Review - Sebastian Bergmann ([@s\_bergmann](https://twitter.com/s_bergmann))

- It's better to have a tool tell you your code is bad, than a person
- Tool-based reviews: Review Board, Atlassian Crucible (commercial)
- [phploc](https://github.com/sebastianbergmann/phploc) analyzes code complexity and such
- [pdepend](https://pdepend.org) generates a pyramid png of metrics & dependencies
- coderank is like pagerank but for code
- [phpcs](https://pear.php.net/package/PHP-CodeSniffer "PHP Code Sniffer") now supports more sniffs than just formatting: bug patterns, unintentional things, performance, patterns
- [phpcpd](https://github.com/sebastianbergmann/phpcpd) detects duplicate code
- Read the Refactoring book by Martin Fowler!

### The Future of PHP - Scott MacVicar ([@scottmac](https://twitter.com/scottmac)), Sebastian Bergmann ([@s\_bergmann](https://twitter.com/s_bergmann)), Matthew Weier O'Phinney ([@weierophinney](https://twitter.com/weierophinney))

- Zend Framework 2 is going to be namespaced
- Frameworks are starting to use the same standards: code conventions, namespaces, phpunit,
  making it much easier to exchange components.
- No "PHP 7 Ultimate Edition", 5.3.999 contains all the nitty gritty now
  (scalar type hinting, traits, unicode safe string functions, dtrace).
  A new version name will be decided on later.
- [xhprof](https://github.com/facebook/xhprof) is performance profiling that's ok to run in production.
- It takes [hiphop](https://github.com/facebook/hiphop-php) 400 CPUs to compile Facebook's code in 16 minutes.
- Find a good tutorial, and solve 1 real world problem in another language. Screw hello world.

## Concluding

As far as best practices I didn't hear a lot of new stuff, but it's always good
to hear more speakers command you to obey them :)

I was especially pleased with some tools I came to know (mostly in [@s\_bergmann](https://twitter.com/s_bergmann)'s talk),
and [@shochdoerfer](https://twitter.com/shochdoerfer)'s sledgehammer approach to defeat untestable legacy code.

Thorsten Rinne's iPhone talk in the [Unconference Room](https://phpconference.nl/schedule/unconference) ([@dpc\_uncon](https://twitter.com/dpc_uncon)) was refreshing & I definitely
need to get my hands dirty with that.

All in all a pretty good conference and I look forward to reading up on all the talks I couldn't attend.

**What's the coolest thing you picked up at the DPC, or another recent conference?**
]]></content:encoded>
      <dc:date>2010-06-20T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Analyze HTTP Requests With TShark</title>
      <link>https://kvz.io/analyze-http-requests-with-tshark.html</link>
      <description><![CDATA[When you're debugging a tough problem you sometimes need to analyze the
HTTP traffic flowing between your machine and a webserver or proxy.
Sometimes you can use firebug or chrome inspector for that. But here's a
low-level alternative that I'm pretty excited about. Meet Tshark.
]]></description>
      <pubDate>Sat, 15 May 2010 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/analyze-http-requests-with-tshark.html</guid>
      <content:encoded><![CDATA[When you're debugging a tough problem you sometimes need to analyze the
HTTP traffic flowing between your machine and a webserver or proxy.
Sometimes you can use firebug or chrome inspector for that. But here's a
low-level alternative that I'm pretty excited about. Meet Tshark.

<!--more-->

Because it's low level, it will run nicely in a separate console.
And it will catch *any* request. That can be useful when you want to find out
what 3rd party apps are communicating.
In my case it was a Flash app that we assumed didn't
respect some redirect headers while downloading static files.
Since it had its own HTTP implementation, firebug was unable to shed
any light on the matter.

I knew tcpdump but was never really happy with it.
And then I found TShark.

## Install

On Ubuntu I typed:

```bash
$ aptitude install tshark
```

But I found implementations for [other systems](https://www.wireshark.org/download.html) as well.

## Sniff HTTP Requests

Tshark can analyze any kind of network traffic, but in my case I was particularly
helped by a command I found on [serverfault](https://serverfault.com/questions/84750/monitoring-http-traffic-using-tcpdump/84777#84777):

```bash
$ tshark 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' -R 'http.request.method == "GET" || http.request.method == "HEAD"'
```

Run that, and browsing to google will dump:

```bash
190.302141 192.168.0.199 -> 74.125.77.104 HTTP GET / HTTP/1.1
190.331454 192.168.0.199 -> 74.125.77.104 HTTP GET /intl/en_com/images/srpr/logo1w.png HTTP/1.1
190.353211 192.168.0.199 -> 74.125.77.104 HTTP GET /images/srpr/nav_logo13.png HTTP/1.1
190.400350 192.168.0.199 -> 74.125.77.100 HTTP GET /generate_204 HTTP/1.1
```

Nice and clean.
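
Because each request ends up on a line of its own, the output is easy to post-process with standard tools. Here's a minimal sketch, assuming the column layout shown above (destination IP in the 4th field). The `count_by_destination` name is made up for illustration:

```bash
# Hypothetical helper: count captured requests per destination IP.
# Assumes lines shaped like: TIME SRC -> DST HTTP METHOD PATH VERSION
count_by_destination() {
  awk '$3 == "->" { hits[$4]++ }
       END { for (ip in hits) print hits[ip], ip }' | sort -rn
}

# Feed it the sample output from above. On a real capture you would
# save tshark's output to a file first and feed that in instead.
count_by_destination <<'EOF'
190.302141 192.168.0.199 -> 74.125.77.104 HTTP GET / HTTP/1.1
190.331454 192.168.0.199 -> 74.125.77.104 HTTP GET /intl/en_com/images/srpr/logo1w.png HTTP/1.1
190.353211 192.168.0.199 -> 74.125.77.104 HTTP GET /images/srpr/nav_logo13.png HTTP/1.1
190.400350 192.168.0.199 -> 74.125.77.100 HTTP GET /generate_204 HTTP/1.1
EOF
```

For the sample this reports 3 requests to `74.125.77.104` and 1 to `74.125.77.100`, busiest host first.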

## Go Crazy

The above was all I needed, but I soon found examples
that demonstrate some other capabilities.

### Count GIF Images Based on Content Type

The command below counts the number of GIF images downloaded through HTTP
(from [codealias](https://www.codealias.info/technotes/the-tshark-capture-and-filter-example-page)):

```bash
$ tshark -R 'http.response and http.content_type contains image' \
  -z 'proto,colinfo,http.content_length,http.content_length' \
  -z 'proto,colinfo,http.content_type,http.content_type' \
  -r /tmp/capture.tmp | grep 'image/gif' | wc -l
```

### Log All POP Users

The command below captures all port 110 traffic and filters out the 'user'
command and saves it to a text file (from [Mark's notes](https://medgarnet.blogspot.com/2007/10/tshark-filter-example.html)):

```bash
$ tshark -i 2 -f 'port 110' -R 'pop.request.parameter contains "user"' > /tmp/pop_users.txt
```

### Log HTTP Request / Receive Headers

One from [superuser](https://superuser.com/questions/178031/how-do-i-return-just-the-http-header-from-tshark)

```bash
$ tshark tcp port 80 or tcp port 443 -V -R "http.request || http.response"
```

Ok that's it for now.
If you have some juicy tshark commands yourself, just post a comment
and I'll update the article.
]]></content:encoded>
      <dc:date>2010-05-15T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Convert All Tables to InnoDB</title>
      <link>https://kvz.io/convert-all-tables-to-innodb-in-one-go.html</link>
      <description><![CDATA[Some time ago I was in the situation where I was looking at 200 MyISAM tables
screaming to get converted to InnoDB for performance reasons.
You probably know that MyISAM is better at full-text searches and such,
but what I needed was for this database to stop locking entire tables when I was
just doing row-level interactions. Here's how I did it in one go.
]]></description>
      <pubDate>Tue, 27 Apr 2010 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/convert-all-tables-to-innodb-in-one-go.html</guid>
      <content:encoded><![CDATA[Some time ago I was in the situation where I was looking at 200 MyISAM tables
screaming to get converted to InnoDB for performance reasons.
You probably know that MyISAM is better at full-text searches and such,
but what I needed was for this database to stop locking entire tables when I was
just doing row-level interactions. Here's how I did it **in one go**.

<!--more-->

I'm not the kind of guy who's going to spend 3 hours & 600 mouseclicks
in phpmyadmin. So this needed to be automated.

## Check Your Engines

To find out which storage engine your tables currently use, execute:

```bash
$ DATABASENAME="kvz"

$ echo "SELECT TABLE_NAME, ENGINE
  FROM information_schema.TABLES
  WHERE TABLE_SCHEMA = '${DATABASENAME}';" | mysql --defaults-file=/etc/mysql/debian.cnf
```

## Dryrun

To see what MySQL commands are going to be executed, you can safely type
this:

```bash
$ DATABASENAME="kvz"

$ echo 'SHOW TABLES;' \
 | mysql --defaults-file=/etc/mysql/debian.cnf ${DATABASENAME} \
 | awk '!/^Tables_in_/ {print "ALTER TABLE `"$0"` ENGINE = InnoDB;"}' \
 | column -t
```

Please change the `DATABASENAME`.

As you can see Ubuntu has - thanks to Debian - the
`/etc/mysql/debian.cnf` file so you [don't even need a password](/blog/2010/03/21/access-mysql-without-password/ "Access MySQL without password")
(that's only after you are `root` of course).

How cool is that.
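
If you'd like to see what the `awk` stage of the dry run does without touching a database, you can feed it some made-up `SHOW TABLES` output. A minimal sketch: the table names are invented and `to_alter_statements` is just an illustrative wrapper around the same `awk` program:

```bash
# Hypothetical wrapper around the awk program used above: skip the
# "Tables_in_<db>" header line and turn every table name into an ALTER statement.
to_alter_statements() {
  awk '!/^Tables_in_/ {print "ALTER TABLE `"$0"` ENGINE = InnoDB;"}'
}

# Simulated output of: echo 'SHOW TABLES;' | mysql kvz
printf '%s\n' 'Tables_in_kvz' 'posts' 'comments' | to_alter_statements
```

The header line is dropped and each remaining line comes back as one `ALTER TABLE` statement, which is exactly what gets piped back into MySQL later on.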

Ok, on to the fun part.

## Warning

To any inexperienced sysadmin reading this, I would have to
make it clear to:

- Investigate if you need this & if your DB is compatible
- First test on a replica on another machine
- Make backups
- Plan for downtime
- Use at own risk
- And most important:
- Not come crying to me that I wrecked your DB :)

## Execute

Now that you have taken all the necessary precautions, here's how to
feed the commands from **Dryrun** back to MySQL again:

```bash
$ DATABASENAME="kvz"

$ echo 'SHOW TABLES;' \
 | mysql --defaults-file=/etc/mysql/debian.cnf ${DATABASENAME} \
 | awk '!/^Tables_in_/ {print "ALTER TABLE `"$0"` ENGINE = InnoDB;"}' \
 | column -t \
 | mysql --defaults-file=/etc/mysql/debian.cnf ${DATABASENAME}
```

Depending on the size of your tables this may take a while.
But by the end of it, you'll have an InnoDB-only database.
Nice.

### Alternative 1

Here's another way as suggested by Tim in the comments:

```bash
$ DATABASENAME="kvz"

$ for t in `echo "show tables" | mysql --batch --skip-column-names $DATABASENAME`; do mysql $DATABASENAME -e "ALTER TABLE \`$t\` ENGINE = InnoDB;"; done
```

His use of `--skip-column-names --batch` is especially noteworthy,
this way we could lose the `awk` matching for `Tables_in_`, which
makes it more robust (what if that string changes in a future version).

### Alternative 2

This one is just in from Bob Sikkema. He mentions:

For MySQL 5.5:

```sql
SELECT CONCAT('ALTER TABLE ',table_schema,'.',table_name,' ENGINE=InnoDB;')
FROM information_schema.tables
WHERE 1=1
    AND engine = 'MyISAM'
    AND table_schema NOT IN ('information_schema', 'mysql', 'performance_schema');
```

Version for MySQL prior to MySQL 5.5:

```sql
SELECT CONCAT('ALTER TABLE ',table_schema,'.',table_name,' ENGINE=InnoDB;')
FROM information_schema.tables
WHERE 1=1
    AND engine = 'MyISAM'
    AND table_schema NOT IN ('information_schema', 'mysql');
```

Using the output from the query, you have a conversion script for the slave.
Tables that have FULLTEXT indexes cannot be converted to InnoDB (InnoDB only gained FULLTEXT support in MySQL 5.6). To locate them, run this query:

```sql
SELECT
    tbl.table_schema,
    tbl.table_name
FROM (
    SELECT
        table_schema,
        table_name
    FROM information_schema.tables
    WHERE 1=1
        AND engine = 'MyISAM'
        AND table_schema NOT IN ('information_schema', 'mysql')
) tbl
INNER JOIN
(
    SELECT
        table_schema,
        table_name
    FROM information_schema.statistics
    WHERE index_type = 'FULLTEXT'
) ndx
USING (table_schema, table_name);
```

## Other Purposes

Obviously these same methods could be used to convert all MySQL tables to
MyISAM, change the encoding of all MySQL tables to UTF8, etc.

Share your thoughts & alternatives!
]]></content:encoded>
      <dc:date>2010-04-27T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Redis PHP Introduction</title>
      <link>https://kvz.io/redis-in-php.html</link>
      <description><![CDATA[Don't know Redis? Think Memcache, with support for
lists, and disk-based storage.
You can use Redis as a database, queue, cache server or all of those combined.
Let's see how you can use this power in your PHP apps.
]]></description>
      <pubDate>Thu, 25 Mar 2010 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/redis-in-php.html</guid>
      <content:encoded><![CDATA[Don't know [Redis](https://code.google.com/p/redis/ "A persistent key-value database with built-in net interface written in ANSI-C for Posix systems")? Think Memcache, with support for
**lists**, and **disk**-based storage.
You can use Redis as a database, queue, cache server or all of those combined.
Let's see how you can use this power in your PHP apps.

<!--more-->

About those **disk**s.
Well, Redis keeps the entire dataset in memory, so it's
still crazy fast:
110000 SETs/second, 81000 GETs/second. Good enough for you?

"..and from time to time the data is saved on disk asynchronously (semi
persistent mode) or alternatively every change is written into an append
only file (fully persistent mode). Redis is able to rebuild the append
only file in background when it gets too big."

About those **list**s. Yes, you can store (serialized) arrays in Memcache.
But every time you change 1 element, you'd have to invalidate & overwrite
the entire array.
Clearly very inefficient, yet a commonly faced problem.

About this **article**. It's meant to help PHP developers taking their first steps
into the Redis world. From here on, there are plenty of other resources
online to dig deeper.

## Install Redis Server

Simple on Ubuntu / Debian:

```bash
$ aptitude install redis-server
```

## Install PHP Library

There are many different PHP client implementations.
I'd like to recommend [Rediska](https://rediska.geometria-lab.net/ "Rediska").
It's feature complete and a true pleasure to work with.
This code will pull in the latest Rediska source, and copy it to
`/usr/share/php/` so you can include it right away.

```bash
$ cd /usr/src
$ [ -d Rediska ] || git clone git://github.com/Shumkov/Rediska.git
$ cd Rediska && git pull && rsync -a ./library/ /usr/share/php/
```

Zend Framework users may want to take a [different approach](https://rediska.geometria-lab.net/documentation/integration-with-frameworks/zend-framework "Integrate Redis with Zend Framework")

## Code

Alright, we are ready to start saving some data.

Let me show you 4 common Redis datatypes to work with:
`Keys`, `Lists`, `Sets` and `Sorted Sets`.

### Keys

In PHP syntax, a `Key` could be thought of as:

```php
<?php
$firstname = 'kevin';
?>
```

Ok, let's save something into Redis.
First, initialize a 'firstname' Key:

```php
<?php
require_once 'Rediska/Key.php';
$Key = new Rediska_Key('firstname');
?>
```

Now let's give it the value 'kevin':

```php
<?php
$Key->setValue('kevin');
?>
```

It's nice to know that `->setValue()` **instantly saves** 'kevin' to
Redis memory. And you don't have to worry about ever losing this name,
because Redis will automatically save to disk afterwards.

Basically that's it. You've saved something in Redis!

At a later point in time you can always retrieve that name by
(initializing the 'firstname' `Key` if you haven't already and) doing a:

```php
<?php
echo $Key->getValue();
?>
```

### Lists

`Lists` are 'collections of unsorted elements'.

In PHP syntax, a `List` could be thought of as a simple array:

```php
<?php
// Names
$list = array(
    'kevin',
    'john',
);
?>
```

Adding new elements to a Redis `List` happens in realtime and at constant
speed.
Meaning that adding an item to a 10 element `List` happens at the same speed
as adding an element to a 10 million element `List`.

So that's the upside.

The downside of this, is that looking up items by index is less fast.

So use Redis `Lists` whenever you need to access data in the same order
it was added.
A message queue would make a perfect `List`.
Also see [Redis Data Types](https://code.google.com/p/redis/wiki/IntroductionToRedisDataTypes "Introduction to Redis Data Types")

So how do you work with Redis `Lists` in PHP?

```php
<?php
// Init
require_once 'Rediska/Key/List.php';
$List = new Rediska_Key_List('names');

// Set
$List->append('kevin');
$List[] = 'john'; // Also works

// Get (this could be done at any time, by any process,
// just initialize the List again)
foreach ($List as $name) {
    echo $name . "\n";
}
?>
```

Easy right? Next up: `Sets`

### Sets

`Sets` are 'collections of unique unsorted elements'.
You can think of this as a hash table where all the values are `true`.
So basically, the values don't really matter in this and the keys
are important.

In PHP syntax, a `Set` could be thought of as:

```php
<?php
// Names
$set = array(
    'kevin' => true,
    'john' => true,
);
?>
```

Because Redis now adds items as keys, they will be unique and you can perform
operations on `Sets` that you can't on `Lists` such as:

- Testing if a given element already exists
- Performing the intersection, union, or difference between multiple `Sets`
- And so forth

Now let's see how you can work with `Sets`:

```php
<?php
// Init
require_once 'Rediska/Key/Set.php';
$Set = new Rediska_Key_Set('names');

// Set
$Set->add('kevin');
$Set[] = 'john'; // Also works
$Set->add('john'); // Still only 1 'john' in the set

// Get
foreach ($Set as $name) {
    echo $name;
}
?>
```

### Sorted Sets

`Sorted Sets` are always ordered by their 'score'. This is also how they are stored.
So any time you retrieve such a `Set`, it's already sorted no matter
what you have added.

In PHP syntax, a `Sorted Set` could be thought of as:

```php
<?php
// Names with birthyears
$zset = array(
    'john' => 1979,
    'kevin' => 1983,
);
?>
```

If we start adding more names & birthyears, old people will automatically
be stored on top. Young at the bottom.

Using `Sorted Sets` in PHP is ridiculously easy:

```php
<?php
// Init
require_once 'Rediska/Key/SortedSet.php';
$ZSet = new Rediska_Key_SortedSet('birthyears');

// Set
$ZSet['kevin'] = 1983;
$ZSet->add('john', 1979); // Also works

// Get
foreach ($ZSet as $name) {
    echo $name . " was born in ";
    echo $ZSet->getScore($name) . "\n";
}
?>
```

## Backup

Before starting to use this in production, you want to know how you can
keep your data safe.

Well, just copy the DB file to a safe place.
On Ubuntu the file is located in `/var/lib/redis/`.

`cp`, `rsync` or `scp` will all do the trick. Redis
does active writing in a temp file so you don't have to worry
about data corruption. Check out the [Redis FAQ](https://code.google.com/p/redis/wiki/FAQ "Some things to know about Redis if you") for more info.
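
Here's a minimal sketch of such a backup, assuming the snapshot is called `dump.rdb` (the default name as far as I know, so check the `dbfilename` setting in your `redis.conf`). The `backup_redis_dump` function and the destination directory are made up for illustration:

```bash
# Hypothetical helper: copy a Redis snapshot to a backup directory,
# adding a timestamp to the file name so older backups are kept.
backup_redis_dump() {
  source_file="$1"   # e.g. /var/lib/redis/dump.rdb (assumed default location)
  backup_dir="$2"    # any directory you can write to

  [ -f "$source_file" ] || { echo "no such file: $source_file" >&2; return 1; }

  mkdir -p "$backup_dir" || return 1
  target="$backup_dir/dump-$(date +%Y%m%d-%H%M%S).rdb"

  # -p preserves timestamps and permissions
  cp -p "$source_file" "$target" && echo "$target"
}

# Example call (needs read access to the Redis data directory):
# backup_redis_dump /var/lib/redis/dump.rdb /var/backups/redis
```

The function prints the path of the copy it made, so it's easy to chain into `scp` or `rsync` for shipping the file off the machine.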

## That's it!

Don't forget to follow Lead Developer [@antirez](https://twitter.com/antirez) on twitter. As of March 2010
VMware is allowing him to work fulltime on Redis, so lots of juicy
updates to get from there!

This article was based on the following documents:

- [Redis](https://code.google.com/p/redis/ "A persistent key-value database with built-in net interface written in ANSI-C for Posix systems")
- [Faq](https://code.google.com/p/redis/wiki/FAQ "Some things to know about Redis if you")
- [Rediska](https://rediska.geometria-lab.net/ "Rediska")
- [Howto](https://www.paperplanes.de/2009/10/30/how-to-redis.html "How to Redis")
- [Datatypes](https://code.google.com/p/redis/wiki/IntroductionToRedisDataTypes "Introduction to Redis Data Types")
]]></content:encoded>
      <dc:date>2010-03-25T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Access MySQL Without Password</title>
      <link>https://kvz.io/access-mysql-without-password.html</link>
      <description><![CDATA[If you want to do command-line MySQL administration like restoring databases
or dumping statistics, you need the root account and its password. Or do you?
]]></description>
      <pubDate>Sun, 21 Mar 2010 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/access-mysql-without-password.html</guid>
      <content:encoded><![CDATA[If you want to do command-line MySQL administration like restoring databases
or dumping statistics, you need the root account and its password. Or do you?

<!--more-->

On Ubuntu / Debian, you don't.
This is useful if you:

- Have lost your password (reset it this way)
- Want to automate tasks (it's a bad idea to pass around root passwords)

## So How Does This Work

Well, normally you would access MySQL like this, right?

```bash
$ mysql -h localhost -u root -p
# Enter password: XYZ
```

Or you can automate it by supplying it in the same command:

```bash
$ mysql -h localhost -u root -pXYZ
```

Pretty messy if you ask me.

What a lot of people don't know is that there is an account that's used by
Debian-related systems to restart the service, perform checks, etc.

The account is called `debian-sys-maint`, its credentials are stored in
`/etc/mysql/debian.cnf` and if you have root access on a database server's shell
you're in business. You can either look up the account, or tell mysql to do it
for you:

```bash
$ mysql --defaults-file=/etc/mysql/debian.cnf
```
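If you'd rather look up the account yourself, something like the sketch below could pull a single value out of that file. It assumes the file is ini-style with a `[client]` section holding `user = ...` and `password = ...` lines; `read_client_option` is a name I made up for this example.

```bash
#!/usr/bin/env bash
# Sketch: print the value of one key from the [client] section of an
# ini-style file. The section and key layout is an assumption about
# how /etc/mysql/debian.cnf looks on your system.

read_client_option() {
    local file="$1" key="$2"
    awk -F'=' -v key="$key" '
        /^\[/ { in_client = ($0 ~ /^\[client\]/) }   # track the current section
        in_client && $1 ~ "^[ \t]*" key "[ \t]*$" {
            value = $0
            sub(/^[^=]*=[ \t]*/, "", value)          # strip "key = "
            sub(/[ \t]+$/, "", value)                # strip trailing blanks
            print value
            exit                                     # first match wins
        }
    ' "$file"
}

# Usage (needs root, the file is not world-readable):
#   read_client_option /etc/mysql/debian.cnf user
#   read_client_option /etc/mysql/debian.cnf password
```

Only the first `=` on a line is treated as the separator, so values that contain `=` themselves come through intact.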

And Bob's your uncle.
]]></content:encoded>
      <dc:date>2010-03-21T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>CakePHP and Nginx</title>
      <link>https://kvz.io/cakephp-and-nginx.html</link>
      <description><![CDATA[I still got sites running Apache, but all new projects are launched with
Nginx. I don't need many of the features that Apache offers, and the speed
gain of Nginx is just tremendous. Once you've experienced it, I doubt you'll
want to go back.
]]></description>
      <pubDate>Wed, 24 Feb 2010 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/cakephp-and-nginx.html</guid>
      <content:encoded><![CDATA[I still got sites running Apache, but all new projects are launched with
Nginx. I don't need many of the features that Apache offers, and the speed
gain of Nginx is just tremendous. Once you've experienced it, I doubt you'll
want to go back.

<!--more-->

## Update - May 2nd, 2013

[The official CakePHP documentation](https://book.cakephp.org/2.0/en/installation/url-rewriting.html#pretty-urls-on-nginx) now includes a good Nginx section. This article is hence deprecated and should only be looked at for Cake 1.3 installations.

## Update - Feb 24th, 2013

Chris Hartjes [takes a similar](https://www.littlehart.net/atthekeyboard/2009/01/25/cakephp-nginx-configuration-update/) approach in articles written [way before](https://www.littlehart.net/atthekeyboard/2007/09/14/configuring-cakephp-to-work-with-nginx/) mine, be sure to check it out!

## Continuing

Even though there has been quite some buzz about Nginx and I bet most of you
have at least heard of it by now, I think the acceptance is still a bit low.

I'd like to help that process along by providing developers a simple yet
effective example.

Maybe you'll play with it on your local box - and eventually decide to go
production. Who knows.

Let me show you how easy it is to get hooked to the power of Nginx.

In this article I'm demonstrating a CakePHP setup, but 1 slight modification
and this applies to pretty much any PHP Framework.

So there are a few differences from your typical Apache setup that I'd like to
highlight.

## Install Nginx

If you're running Ubuntu like me it helps to have Karmic or higher. Then just
type:

```bash
$ aptitude install nginx
```

Other operating systems: it shouldn't be much harder than that, just replace
aptitude with your package manager.

If you find yourself compiling, you're on the wrong track and going to spend
way too much time on this.

Don't forget to shut down any existing web servers you may have.

## PHP FastCGI: spawn-fcgi

In this setup, PHP is daemonized and **keeps running** as a process, listening
to a socket.

Nginx will be configured to pass any \*.php requests to this PHP process.
Normally PHP would have to be fired up for every request. But now it resides
in memory.

To install spawn-fcgi in Ubuntu you'd do:

```bash
$ aptitude install spawn-fcgi php5-cgi

# It doesn't provide a startup script yet. Here's how to get one:
$ curl https://raw.github.com/kvz/deprecated/kvzlib/configs/spawnfcgi_initd -ko /etc/init.d/spawn-fcgi \
 && chmod a+x $_
$ update-rc.d spawn-fcgi defaults
```

Now start it with:

```bash
$ /etc/init.d/spawn-fcgi start
```

Excellent. PHP is listening on port 9000 for incoming Nginx jobs.

You can configure your new PHP install like you're used to.

Only in this directory: `/etc/php5/cgi/` instead of this one `/etc/php5/apache2/`

### Alternative: php-fpm

There also is php-fpm. Pretty much does the same thing, but faster.

Unfortunately at the time of writing I'm experiencing too many crashes for it
to be recommendable. This will probably change soon though.

## HtAccess

It's true that Nginx doesn't support .htaccess, but to be honest: .htaccess
files are the worst.

The additional recursive dir-stats, I/O, & processing involved with every
request, is equal to the exact amount of punches in the face.

That it doesn't support .htaccess does not imply however, that you can't do
rewrites and all the other fancy things you could do with .htaccess files.

Just slightly modify the syntax, and place those new rules in your Nginx
VHost.

As a present, I've already converted the .htaccess rules required to run
CakePHP, and put them in the Nginx VHost example below.

## Nginx Vhost

The VHost concept works the same as in Apache: have one for every site.

Save it in `/etc/nginx/sites-available/site`. To activate, symlink it to
`/etc/nginx/sites-enabled/site`

and run:

```bash
$ /etc/init.d/nginx reload
```

..and you're in business. Use

```bash
$ tail -f /var/log/nginx/*.log
```

to see what's going on.

You don't need to change Nginx's main config, it's tight by default so just
stick with your VHost for now.
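The activation step described above (symlink the VHost, then reload) could be wrapped in a small helper like this. It's only a sketch: `enable_site` is a hypothetical name, and the two directories default to the Ubuntu paths mentioned above but can be overridden.

```bash
#!/usr/bin/env bash
# Sketch: activate an Nginx VHost by symlinking it from sites-available
# into sites-enabled. Directory defaults are the Ubuntu layout.

enable_site() {
    local name="$1"
    local available="${2:-/etc/nginx/sites-available}"
    local enabled="${3:-/etc/nginx/sites-enabled}"

    if [ ! -f "$available/$name" ]; then
        echo "No such VHost file: $available/$name" >&2
        return 1
    fi

    # -s: symbolic link, -f: replace an existing link,
    # -n: don't follow an existing link that points at a directory
    ln -sfn "$available/$name" "$enabled/$name"
}

# Usage (as root); `nginx -t` checks the config before you reload:
#   enable_site site && nginx -t && /etc/init.d/nginx reload
```

Keeping the real file in `sites-available` means you can disable a site again by deleting just the symlink.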

Here's what a fully working CakePHP VHost looks like:

```bash
server {
    listen      80;
    server_name example.com www.example.com;
    access_log  /var/log/nginx/example.com.access.log;
    error_log   /var/log/nginx/example.com.error.log;
    rewrite_log on;
    root        /var/www/example.com/app/webroot;
    index       index.php index.html index.htm;

    # Not found this on disk?
    # Feed to CakePHP for further processing!
    if (!-e $request_filename) {
        rewrite ^/(.+)$ /index.php?url=$1 last;
        break;
    }

    # Pass the PHP scripts to FastCGI server
    # listening on 127.0.0.1:9000
    location ~ \.php$ {
        # fastcgi_pass   unix:/tmp/php-fastcgi.sock;
        fastcgi_pass   127.0.0.1:9000;
        fastcgi_index  index.php;
        fastcgi_intercept_errors on; # to support 404s for PHP files not found
        fastcgi_param  SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include        fastcgi_params;
    }

    # Static files.
    # Set expire headers, Turn off access log
    location ~* /favicon\.ico$ {
        access_log off;
        expires 1d;
        add_header Cache-Control public;
    }
    location ~ ^/(img|cjs|ccss)/ {
        access_log off;
        expires 7d;
        add_header Cache-Control public;
    }

    # Deny access to .htaccess files,
    # git & svn repositories, etc
    location ~ /(\.ht|\.git|\.svn) {
        deny  all;
    }
}
```

Notes about the example:

- [Nginx config](https://wiki.nginx.org/NginxConfiguration) is simple & powerful. If you want you can use if
  statements and put some very basic logic in there.
- In this example the /app/webroot is the document root. Some people may
  have a / as their CakePHP webroot or even /app. But I recommend changing that
  to /app/webroot so you're not exposing any more PHP files than is strictly
  required.
- Notice how this config turns off access log for some static files? How
  cool is that?!
- Check out how simple it is to set expire headers for different content
  types
- Change example.com to your domain name
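To get a feel for what the `rewrite` rule in the CakePHP VHost does to a request path, here's a tiny shell sketch that applies the same regular expression with `sed`. It only mimics the pattern itself; it doesn't check whether the file exists on disk and it ignores query strings, both of which Nginx handles for you. `rewrite_for_cake` is a made-up name.

```bash
#!/usr/bin/env bash
# Sketch: mimic `rewrite ^/(.+)$ /index.php?url=$1 last;` on a path.
# This only demonstrates the regex, not Nginx's full rewrite handling.

rewrite_for_cake() {
    printf '%s\n' "$1" | sed -E 's#^/(.+)$#/index.php?url=\1#'
}

# Usage:
#   rewrite_for_cake /posts/view/1   # prints /index.php?url=posts/view/1
#   rewrite_for_cake /               # prints / (nothing after the slash to capture)
```

So a pretty URL like `/posts/view/1` ends up at CakePHP's front controller, with the original path in the `url` parameter.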

## Free Bonus

Here's a VHost if you'd want to use phpMyAdmin as installed by Ubuntu's
Apt:

```bash
server {
    listen       80;
    server_name  phpmyadmin.example.com;
    root         /usr/share/phpmyadmin;
    index        index.php;

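    # Add your IP to the allow list!
    # Set at server level so it also covers the PHP location below;
    # a `location /` block would not apply to requests matched by `\.php$`.
    allow 123.123.123.123;
    deny all;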

    location ~ \.php$ {
        index          index.php index.html;
        fastcgi_pass   127.0.0.1:9000;
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_intercept_errors on; # to support 404s for PHP files not found
        include        fastcgi_params;
    }
}
```

I'm keeping the larger code blocks on [GitHub](https://github.com/kvz/deprecated/tree/kvzlib/configs/ "Configs on GitHub") so it'll be really easy for you to download, fork etc.

For instance, I have [another CakePHP configuration](https://raw.github.com/kvz/deprecated/kvzlib/configs/nginx_vhost_cakephp_2 "Bigger CakePHP Nginx config") that's a bit bigger and
shows you some other stuff you can do with Nginx.

## That's it

Now go out there, have fun, tell me your findings.
]]></content:encoded>
      <dc:date>2010-02-24T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>CakePHP REST Plugin Presentation</title>
      <link>https://kvz.io/cakephp-rest-plugin-presentation.html</link>
      <description><![CDATA[At our company we have a lot of uses for a solid API. We can use it to
distribute config files, have servers report in, let customers edit DNS
records using their own interface, etc.
]]></description>
      <pubDate>Wed, 13 Jan 2010 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/cakephp-rest-plugin-presentation.html</guid>
      <content:encoded><![CDATA[At [our company](https://true.nl "True Together") we have a lot of uses for a solid API. We can use it to
distribute config files, have servers report in, let customers edit DNS
records using their own interface, etc.

<!--more-->

Now that I'm converting all of our legacy code to a big CakePHP application,
the API needed a revisit as well. I chose to use REST as a standard, read
about everything related to Cake & REST, and started hacking on a reusable
plugin. The idea is that you can drop it in any application and unlock
existing functionality to REST with minimal changes to your code.

It is still a work in progress, but as the first Dutch CakePHP event was held
yesterday and I was asked to present something, I thought this particular
plugin might be of interest to the community. Here are the slides:

<iframe src="https://www.slideshare.net/slideshow/embed_code/2901872?rel=0" width="512" height="421" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" style="border:1px solid #CCC;border-width:1px 1px 0;margin-bottom:5px" allowfullscreen> </iframe> <div style="margin-bottom:5px"> <strong> <a href="https://www.slideshare.net/kevinvz/rest-presentation-2901872" title="CakePHP REST Plugin" target="_blank">CakePHP REST Plugin</a> </strong> from <strong><a href="https://www.slideshare.net/kevinvz" target="_blank">Kevin van Zonneveld</a></strong> </div>

View on [slideshare](https://www.slideshare.net/kevinvz/rest-presentation-2901872 "Rest Presentation")

And here's the [source & documentation on github](https://github.com/kvz/cakephp-rest-plugin "CakePHP REST plugin")

I would love some feedback to help make it better. My todos can be found in
the slides as well, to give you an idea where I'm heading with this.

## More Info on the Dutch CakePHP Event

- [@jsentel](https://twitter.com/jsentel)'s [writeup on the talks](https://sentel.nl/blog/dutch-cakephp-show-your-code-meetup-borrel)
- [Singel 146's followup (the location in Amsterdam)](https://www.singel146.nl/2010/01/cakephp-borrel-show-your-code-meetup/)
- [@charli3](https://twitter.com/charli3)'s [initial post](https://www.cake-toppings.com/2009/12/30/dutch-borrel-show-your-code-meetup/)
- [#cakephpnl](https://twitter.com/i/#!/search/%23cakephpnl) on twitter
- [LinkedIn group: CakePHP NL](https://www.linkedin.com/groups?gid=2655990)
]]></content:encoded>
      <dc:date>2010-01-13T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Run Node.js as a Service on Ubuntu</title>
      <link>https://kvz.io/run-nodejs-as-a-service-on-ubuntu-karmic.html</link>
      <description><![CDATA[The core of our new project runs on Node.js. With Node you can write
very fast JavaScript programs serverside. It's pretty easy to install Node,
code your program, and run it. But how do you make it run nicely in the
background like a true server?
]]></description>
      <pubDate>Tue, 15 Dec 2009 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/run-nodejs-as-a-service-on-ubuntu-karmic.html</guid>
      <content:encoded><![CDATA[The core of [our new project](https://transload.it/ "Transload.it") runs on [Node.js](https://nodejs.org/ "Node.js"). With Node you can write
very fast JavaScript programs serverside. It's pretty easy to install Node,
code your program, and run it. But how do you make it run nicely in the
background like a true server?

<!--more-->

Clever chaps will have noticed you can just use the '&' like so:

```bash
$ node ./yourprogram.js &
```

and send your program to the background. But:

- if Node ever prints something and your console is closed, the STDOUT no
  longer exists and `yourprogram.js` will die
- what if the process crashes, what if your server reboots?

Ok, so we needed something more robust.
More like a real daemon, one that's recognized by the Operating System as such.

### Upstart

Our servers run Ubuntu's latest: Karmic Koala, which packs a pretty decent
version of [upstart](https://upstart.ubuntu.com/wiki/Stanzas "How to write upstart scripts").
Upstart will eventually replace the well-known
`/etc/init.d` scripts, and will bring some additional advantages to the table
like: speed, health checking, simplicity, etc.

## Writing an Upstart Script

Turns out, writing your own upstart scripts is way easier than building init.d
files based on the `/etc/init.d/skeleton` file.

OK, so here's what it looks like. You should store the script in
`/etc/init/yourprogram.conf`, and create one for each Node program you write.

```bash
description "node.js server"
author      "kvz - https://kevin.vanzonneveld.net"

# Used to Be: Start on Startup
# until we found some mounts weren't ready yet while booting:
start on started mountall
stop on shutdown

# Automatically Respawn:
respawn
respawn limit 99 5

script
    # Not sure why $HOME is needed, but we found that it is:
    export HOME="/root"

    exec /usr/local/bin/node /where/yourprogram.js >> /var/log/node.log 2>&1
end script

post-start script
   # Optionally put a script here that will notify you node has (re)started
   # /root/bin/hoptoad.sh "node.js has started!"
end script
```

Wow, how easy was that? Told you, upstart scripts are child's play. In fact
they're so compact, you may find yourself changing almost every line because
they contain specifics of our environment.
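Since you need one of these files per Node program, you could generate them from a template. The sketch below prints a trimmed-down version of the job definition above for a given program. `render_upstart_conf` is a hypothetical helper, and the per-program log file under `/var/log/` is my own assumption rather than something Upstart prescribes.

```bash
#!/usr/bin/env bash
# Sketch: print an Upstart job definition for one Node program.
# All names and paths are arguments; the defaults are assumptions.

render_upstart_conf() {
    local name="$1" script_path="$2"
    local node_bin="${3:-/usr/local/bin/node}"
    local log_file="${4:-/var/log/$name.log}"   # hypothetical per-program log

    cat <<EOF
description "$name node.js server"

start on started mountall
stop on shutdown

respawn
respawn limit 99 5

script
    export HOME="/root"
    exec $node_bin $script_path >> $log_file 2>&1
end script
EOF
}

# Usage (as root):
#   render_upstart_conf yourprogram /where/yourprogram.js > /etc/init/yourprogram.conf
```

Printing to STDOUT instead of writing the file directly lets you review the result before it lands in `/etc/init/`.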

### Non-Root

Node can do a lot of stuff. Or break it if you're not careful. So you may want
to run it as a user with limited privileges. We decided to go conventional and
chose `www-data`.

We found the easiest way was to prepend the Node executable with a sudo like
this:

```bash
exec sudo -u www-data /usr/local/bin/node
```

Don't forget to change your export HOME accordingly.

## Restarting Your Node.js Daemon

This is so ridiculously easy..

```bash
$ start yourprogram
$ stop yourprogram
```

And yes, Node will already:

- automatically start at boottime
- log to `/var/log/node.log`

..that's been defined inside our upstart script.

### initctl

But wait, `start` and `stop` are just shortcuts. What's really behind the wheel
here is `initctl`. You can play around with the command to see what other
possibilities there are:

```bash
$ initctl help
$ initctl status yourprogram
$ initctl reload yourprogram
$ initctl start yourprogram # yes, this is the same start
# etc
```

## Update from October 30th, 2012

The basic idea has not changed since 2009, but we did add some tricks to our
upstart script. Here's what we now use in production
at [transloadit.com](https://transloadit.com):

```bash
# cat /etc/init/transloaditapi2.conf
# https://upstart.ubuntu.com/wiki/Stanzas

description "Transloadit.com node.js API 2"
author      "kvz"

stop on shutdown
respawn
respawn limit 20 5

# Max open files are @ 1024 by default. Bit few.
limit nofile 32768 32768

script
  set -e
  mkfifo /tmp/api2-log-fifo
  ( logger -t api2 </tmp/api2-log-fifo & )
  exec >/tmp/api2-log-fifo
  rm /tmp/api2-log-fifo
  exec sudo -u www-data MASTERKEY=`cat /transloadit/keys/masterkey` /transloadit/bin/server 2>&1
end script

post-start script
   /transloadit/bin/notify.sh 'API2 Just started'
end script
```

## More on Node.js

With Node you can write very fast JavaScript programs serverside. We've seen
[examples](https://wiki.github.com/ry/node "Node wiki on GitHub shows links to many Node projects") of chat, key-value store, and full blown http servers. Basically
anything is possible as long as you know JavaScript and the concepts of
parallel/evented processing. You don't? Well if you've ever used
`setTimeout()`, you'll soon get the hang of it ; )

- [Node.js video presentation](https://jsconf.eu/2009/video-nodejs-by-ryan-dahl.html) by creator Ryan Dahl
- [Node.js slides](https://nodejs.org/jsconf.pdf) that accompany the presentation
- [About Node](https://nodejs.org/#about) on the official website
- [Node.js is genuinely exciting](https://blog.simonwillison.net/post/57956855516/node) by Simon Willison
- [node.js](https://debuggable.com/posts/node-js:4ab4d9d7-b788-41d4-85c0-1b51cbdd56cb) by Debuggable
]]></content:encoded>
      <dc:date>2009-12-15T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Git Migration - Remove Passwords From History</title>
      <link>https://kvz.io/git-migration-remove-passwords-from-history.html</link>
      <description><![CDATA[When migrating projects over to GitHub, I found there were still some
passwords inside my SVN repositories. Obviously it's not good practice to
store your passwords in a code repository - let alone at a remote location, so
I wanted to replace all passwords. Not only in the current version, but in all
commits that have been made over the past 3 years. Luckily with Git - you can.
]]></description>
      <pubDate>Sun, 08 Nov 2009 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/git-migration-remove-passwords-from-history.html</guid>
      <content:encoded><![CDATA[When migrating projects over to GitHub, I found there were still some
passwords inside my SVN repositories. Obviously it's not good practice to
store your passwords in a code repository - let alone at a remote location, so
I wanted to replace all passwords. Not only in the current version, but in all
commits that have been made over the past 3 years. Luckily with Git - you can.

<!--more-->

Now, there is a guide to [Remove sensitive data](https://help.github.com/removing-sensitive-data/ "GitHub Help") on GitHub; but that
removes files completely.

I wanted to preserve the files and just replace the passwords in Git history.

So my plan was to:

- Create GitHub accounts for every SVN committer
- Store the SVN&lt;>GitHub account mapping in `~/.authors`
- Checkout SVN tree as a local Git repo (using `git-svn`)
- Go over all the commits and replace all passwords with `xXxXxXxXxXx`
- Go over all code in the HEAD - the current version of the project:
  - find `xXxXxXxXxXx`
  - replace with `App::config('Database.main.password')`
  - Have `App::config` take the password from a config file that's outside
    the repository

Now that I have a working HEAD without real passwords or `xXxXxXxXxXx`, and a
lot of previous versions with just `xXxXxXxXxXx` in them:

- Send it to GitHub
- Continue leading a happy life without worries.

Here are the commands I ended up using:

```bash
# Sample starts here
# Import from SVN
cd ${HOME}/workspace
git svn clone --authors-file=${HOME}/.authors svn://svn.example.com/projectX/trunk projectX

cd projectX

# Rewrite history
git filter-branch --tree-filter 'git ls-files -z "*.php" |xargs -0 perl -p -i -e "s#(PASSWORD1|PASSWORD2|PASSWORD3)#xXxXxXxXxXx#g"' -- --all

# Make workspace look like HEAD
git reset --hard

# Try to recompress and clean up, then check the new size
git gc --aggressive --prune

# To GitHub
git remote add origin git@github.com:kvz/projectX.git
git push origin master
```

Look out for these keywords as you'll have to substitute them with your own:

- projectX
- example.com
- kvz
- .authors
- PASSWORD1
- PASSWORD2
- PASSWORD3

**Warning!** Rewriting history can be dangerous! :)

Seriously though.. Be absolutely sure you know what you're doing and make
backups before doing anything.
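Before pushing, it doesn't hurt to double check that no password survived. Here's a minimal sketch that scans a directory for leftovers; `find_leftover_passwords` is a made-up helper and the pattern is the same placeholder list as above. Note that it only inspects the files currently on disk, not older commits.

```bash
#!/usr/bin/env bash
# Sketch: list files under a directory that still contain one of the
# given passwords. Returns 0 when nothing is found, 1 otherwise.

find_leftover_passwords() {
    local dir="$1" pattern="$2"
    local hits
    # -r: recurse, -l: print file names only, -E: extended regex.
    # `|| true` keeps a "no match" exit status from aborting strict scripts.
    hits="$(grep -rlE -e "$pattern" "$dir" || true)"
    if [ -n "$hits" ]; then
        printf '%s\n' "$hits"
        return 1
    fi
    return 0
}

# Usage:
#   find_leftover_passwords "${HOME}/workspace/projectX" 'PASSWORD1|PASSWORD2|PASSWORD3' \
#     && echo "Working tree looks clean"
```

To look through every commit instead of just the working tree, something like `git grep -lE 'PASSWORD1|PASSWORD2|PASSWORD3' $(git rev-list --all)` should do, although on a big history the argument list can get very long.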
]]></content:encoded>
      <dc:date>2009-11-08T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Generate HTML With PHP</title>
      <link>https://kvz.io/generate-html-with-php.html</link>
      <description><![CDATA[Hi. Have you met KvzHTML? It's a standalone PHP Class for generating HTML.
]]></description>
      <pubDate>Sun, 11 Oct 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/generate-html-with-php.html</guid>
      <content:encoded><![CDATA[Hi. Have you met KvzHTML? It's a standalone PHP Class for generating HTML.

It's been hiding deep inside the caverns of my secret GitHub repo: [kvzlib](https://github.com/kvz/deprecated/tree/kvzlib "kvzlib"),
a collection of code snippets too small or unfinished to deserve their own
repository. But I find working with this class so pleasant, I thought I'd
share the fun.

<!--more-->

KvzHTML (yeah, not the most imaginative name, sorry for that) will generate
regular HTML or XML, it just makes the job a bit easier on you because of
these features:

- Automatic Table Of Contents support
- Very compact syntax
- Tag nesting & automatic indentation

Without wasting your time going on & on about it, just let me show you some
examples, and you be the judge.

## Introduction

A basic usage example.

```php
<?php
error_reporting(E_ALL);
require_once 'KvzHTML.php';

// These are the default options, so might
// as well have initialized KvzHTML with an
// empty first argument
$H = new KvzHTML(array(
  'xhtml' => true,
  'track_toc' => false,
  'link_toc' => true,
  'indentation' => 4,
  'newlines' => true,
  'echo' => false,
  'buffer' => false,
  'xml' => false,
  'tidy' => false,
));

echo $H->html(
  $H->head(
      $H->title('My page')
  ) .
  $H->body(
      $H->h1('Important website') .
      $H->p('Welcome to our website.') .
      $H->h2('Users') .
      $H->p('Here\'s a list of current users:') .
      $H->table(
          $H->tr($H->th('id') . $H->th('name') . $H->th('age')) .
          $H->tr($H->td('#1') . $H->td('Kevin van Zonneveld') . $H->td('26')) .
          $H->tr($H->td('#2') . $H->td('Foo Bar') . $H->td('28'))
      )
  )
);
?>
```

### Result

```markup
<html>
    <head>
        <title>
            My page
        </title>
    </head>
    <body>
        <h1>
            Important website
        </h1>
        <p>
            Welcome to our website.
        </p>
        <h2>
            Users
        </h2>
        <p>
            Here's a list of current users:
        </p>
        <table>
            <tr>
                <th>
                    id
                </th>
                <th>
                    name
                </th>
                <th>
                    age
                </th>
            </tr>
            <tr>
                <td>
                    #1
                </td>
                <td>
                    Kevin van Zonneveld
                </td>
                <td>
                    26
                </td>
            </tr>
            <tr>
                <td>
                    #2
                </td>
                <td>
                    Foo Bar
                </td>
                <td>
                    28
                </td>
            </tr>
        </table>
    </body>
</html>
```

## Automatic TOC

Keeping track of a Table of Contents can become quite tedious because every
change you make; you have to make twice. So why not let KvzHTML handle this
for you?

In this example I also show you that:

- You don't have to nest KvzHTML functions to nest HTML Elements, per se.
  You can open a tag by setting its body to TRUE, or by leaving it empty.
  You can then later close it with `->tag(false)`.
- You can set echo on if you don't nest KvzHTML functions, so that every
  function will immediately get echoed,
- ..unless you turn on buffering.

All these different options can lead to the same result, but they are there
to reduce your need to type to an absolute minimum.

```php
<?php
error_reporting(E_ALL);
require_once 'KvzHTML.php';

// Some options:
// - create a ToC
// - don't automatically create links for ToC navigation
// - echo output, don't return
// - save all echoed output in a buffer
// - Don't automatically Tidy the output (btw, only works with buffer on)
$E = new KvzHTML(array(
  'track_toc' => true,
  'link_toc' => false,
  'echo' => true,
  'buffer' => true,
  'tidy' => false,
));

$E->h1('New application');
$E->p($E->loremIpsum);

$E->h2('Users');
$E->blockquote($E->loremIpsum);

$E->h3('Permissions');
$E->p($E->loremIpsum);

$E->h4('General Concept');
$E->p($E->loremIpsum);

$E->h4('Exceptions');
$E->p($E->loremIpsum);

$E->h3('Usability');
$E->ul(); // An empty body will just open the tag: <ul>
  $E->li('Point 1');
  $E->li('Point 2');
  $E->li();
      $E->strong('Point 3');
      $E->br(null);  // NULL will make a self closing tag: <br />
      $E->span('Has some implications.');
  $E->li(false);
$E->ul(false);  // False will close the tag: </ul>

// Save both chunks so further KvzHTML calls
// won't impact them anymore
$toc    = $E->getToc();
$document = $E->getBuffer();

// Print a heading that says TOC
$E->h1('Table of Contents', array('__buffer' => false));

// Print toc
echo $toc;

// Print original document
echo $document;
?>
```

### Result

```markup
<h1>
    Table of Contents
</h1>
<a name='toc_root'></a>
<ul>
 <li>New application</li>
 <ul>
  <li>Users</li>
  <ul>
   <li>Permissions</li>
   <ul>
    <li>General Concept</li>
    <li>Exceptions</li>
    </ul>
   </ul>
   <li>Usability</li>
   </ul>
  </ul>
 </ul>
</ul>
<h1>
    New application
</h1>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing
    elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
    ut aliquip ex ea commodo consequat. Duis aute irure dolor in
    reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
    pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa
    qui officia deserunt mollit anim id est laborum
</p>
<h2>
    Users
</h2>
<blockquote>
    Lorem ipsum dolor sit amet, consectetur adipisicing
    elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
    ut aliquip ex ea commodo consequat. Duis aute irure dolor in
    reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
    pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa
    qui officia deserunt mollit anim id est laborum
</blockquote>
<h3>
    Permissions
</h3>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing
    elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
    ut aliquip ex ea commodo consequat. Duis aute irure dolor in
    reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
    pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa
    qui officia deserunt mollit anim id est laborum
</p>
<h4>
    General Concept
</h4>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing
    elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
    ut aliquip ex ea commodo consequat. Duis aute irure dolor in
    reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
    pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa
    qui officia deserunt mollit anim id est laborum
</p>
<h4>
    Exceptions
</h4>
<p>
    Lorem ipsum dolor sit amet, consectetur adipisicing
    elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
    ut aliquip ex ea commodo consequat. Duis aute irure dolor in
    reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
    pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa
    qui officia deserunt mollit anim id est laborum
</p>
<h3>
    Usability
</h3>
<ul>
<li>
    Point 1
</li>
<li>
    Point 2
</li>
<li>
<strong>
    Point 3
</strong>
<br />
<span>
    Has some implications.
</span>
</li>
</ul>
```

## Generating XML

As an added bonus, you can even make XML documents with KvzHTML

```php
<?php
error_reporting(E_ALL);
require_once 'KvzHTML.php';

$H = new KvzHTML(array(
  'xml' => true,
));

echo $H->xml(
  $H->auth(
      $H->username('kvz') .
      $H->api_key(sha1('xxxxxxxxxxxxxxxx'))
  ) .
  $H->server_reboot(
      $H->dry_run(null) .
      $H->hostname('www1.example.com') .
      $H->server_id(888)
  )
);
?>
```

### Result

```markup
<?xml version='1.0' encoding='UTF-8'?>
<auth>
    <username>kvz</username>
    <api_key>a7a7c2e911a47b967d34b5a8807c040e9d167815</api_key>
</auth>
<server_reboot>
    <dry_run />
    <hostname>www1.example.com</hostname>
    <server_id>888</server_id>
</server_reboot>
```

## Special Functions

I had most fun with KvzHTML while generating HTML that would be [converted to PDF](https://code.google.com/p/wkhtmltopdf/) afterwards. These were massive HTML documents that no designer was
going to touch. So no need to keep the HTML plain for Dreamweaver or anything.

In this light, some of the functions below will make more sense.

```php
<?php
error_reporting(E_ALL);
require_once 'KvzHTML.php';

// I find it easy to work with 2 instances.
//    One that will echo directly: $E
// and One that supports nesting: $H
$H = new KvzHTML();
$E = new KvzHTML(array('echo' => true, 'buffer' => true, 'tidy' => true));

// To save you even more typing. The following tags
// have an inconsistent interface:
// a, img, css, js

$E->html();
  $E->head(
      $H->title('Report') .
      $H->style('
         div.page {
             font-family: helvetica;
             font-size: 12px;
             page-break-after: always;
             min-height: 1220px;
             width: 830px;
         }
     ') .
      $H->css('/css/style.js') .
      $H->js('/js/jquery.js')
  );

  // Page 1
  $E->page(true, array('style' => array(
      'page-break-before' => 'always',
  )));

  $E->h1('Report');

      $E->p(
          $H->a('https://true.nl', 'Visit our homepage') .
          $H->img('/assets/images/posts/2009-10-11-generate-html-with-php-0.png')
      );

      $E->ul(
          $H->li('Health') .
          $H->li('Uptime') .
          $H->li('Logs') .
          $H->li('Recommendations')
      );
  $E->page(false);

  // Page 2
  $E->page();
      $E->float($H->img('https://en.gravatar.com/userimage/3781109/874501515fabcf6069d64c626cf8e4f6.png'));
      $E->float($H->img('https://en.gravatar.com/userimage/3781109/874501515fabcf6069d64c626cf8e4f6.png'));
      $E->clear();
  $E->page(false);

  // Page 3
  $E->page(
      $H->h2('Warnings') .
      $H->p('Disk space', array('class' => 'warning'))
  );

$E->html(false);

echo $E->getBuffer();
?>
```

### Result

```markup
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
    <head>
        <title>
            Report
        </title>
        <style type="text/css">

                    div.page {
                        font-family: helvetica;
                        font-size: 12px;
                        page-break-after: always;
                        min-height: 1220px;
                        width: 830px;
                    }
        </style>
        <link type='text/css' rel='stylesheet' href='/css/style.js'>
        <script type='text/javascript' src='/js/jquery.js'>
</script>
        <style type="text/css">
div.c3 {clear: both;}
        div.c2 {float: left;}
        div.c1 {page-break-before: always;}
        </style>
    </head>
    <body>
        <div class='page c1'>
            <h1>
                Report
            </h1>
            <p>
                <a href='https://true.nl'>Visit our homepage</a> <img src='/assets/images/posts/2009-10-11-generate-html-with-php-0.png'>
            </p>
            <ul>
                <li>Health
                </li>
                <li>Uptime
                </li>
                <li>Logs
                </li>
                <li>Recommendations
                </li>
            </ul>
        </div>
        <div class='page'>
            <div class='c2'>
                <img src='https://en.gravatar.com/userimage/3781109/874501515fabcf6069d64c626cf8e4f6.png'>
            </div>
            <div class='c2'>
                <img src='https://en.gravatar.com/userimage/3781109/874501515fabcf6069d64c626cf8e4f6.png'>
            </div>
            <div class='c3'></div>
        </div>
        <div class='page'>
            <h2>
                Warnings
            </h2>
            <p class='warning'>
                Disk space
            </p>
        </div>
    </body>
</html>
```

## And There You Have it..

[Get it while it's hot](https://raw.github.com/kvz/deprecated/kvzlib/php/classes/KvzHTML.php "Download KvzHTML"), and if you have improvements: Leave me a comment or even better: GitHub me some patches! :)
]]></content:encoded>
      <dc:date>2009-10-11T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Svn to Git</title>
      <link>https://kvz.io/svn-to-git.html</link>
      <description><![CDATA[Today I've moved all of my SVN repositories over to GitHub. 5 private reps and
4 public ones. Two of which you may know: PHP.JS and System_Daemon.
]]></description>
      <pubDate>Thu, 03 Sep 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/svn-to-git.html</guid>
      <content:encoded><![CDATA[Today I've moved all of my SVN repositories over to GitHub. 5 private reps and
4 public ones. Two of which you may know: [PHP.JS](https://phpjs.org) and [System\_Daemon](https://pear.php.net/package/System-Daemon).

<!--more-->

Reasons for **not doing** this earlier:

- Not wanting to invest time in (new) Version Control Systems at all - I'd
  rather be coding
- Didn't think Git's features would make a big difference for me:
  - Small projects, only a handful of committers
  - Besides SVN, I already had NetBeans' local version control, so I could
    already mess things up and revert, without bothering the main repo.

Because of this it took me a while to take the plunge. But here are the reasons
that finally made me:

- Saw [some](https://www.youtube.com/watch?v=4XpnKHJAok8) [nice](https://www.youtube.com/watch?v=j45cs5-nY2k&feature=channel) [git video](https://www.youtube.com/watch?v=8dhZ9BXQgc4&NR=1)s during my holidays
- [@felixge](https://twitter.com/felixge) has been [pushing me](/blog/2009/07/15/notes-on-cakefest-3/) for a while :)
- [GitHub.com](https://github.com/)'s tool-set saves a lot of time and opens new opportunities
  (Service Hooks, Collaboration, and even Donation)
- No more need to maintain my own [SVN](https://subversion.tigris.org/) server
- It's easier for people to fork the projects and **submit patches**, which
  should improve the code
- Once you start moving some stuff over, you'd better move all of it over so
  you can make use of git submodules instead of having to wire SVN & Git reps
  together
- **and last but not least:**
  Got fed up with SVN being too delicate and stubborn

## Changes

If you're involved in any way with one of my projects, here are the changes.

The good stuff is moving over [here](https://github.com/kvz). Feel free to ask any questions and
report problems, and please remember to update your bookmarks (or even your
svn:externals if you have them referring to my reps).

This also means my [Trac](https://trac.edgewall.org/) sites are going down in favor of GitHub's source
view, issue system & integrated wiki.

### Service Hooks

Furthermore I created a [@phpjs](https://twitter.com/phpjs) Twitter account so you can track when new
stuff gets added to the [PHP.JS GitHub repository](https://github.com/kvz/phpjs/tree/master) and a [@system\_daemon](https://twitter.com/system_daemon)
account for the same reason.

If you don't like Twitter you can obviously also use GitHub's RSS feeds for
this: [phpjs](https://github.com/feeds/kvz/commits/phpjs/master), [system\_daemon](https://github.com/feeds/kvz/commits/system-daemon/master).
]]></content:encoded>
      <dc:date>2009-09-03T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Flush Memcached Using Bash</title>
      <link>https://kvz.io/flush-memcached-using-bash.html</link>
      <description><![CDATA[If you store application data in memcache, you may want to invalidate it once
you deploy a new version to avoid corruption or weird results. There are
several ways to do this but I recently tried one using nothing but Bash, and I
like it.
]]></description>
      <pubDate>Wed, 19 Aug 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/flush-memcached-using-bash.html</guid>
      <content:encoded><![CDATA[If you store application data in memcache, you may want to invalidate it once
you deploy a new version to avoid corruption or weird results. There are
several ways to do this but I recently tried one using nothing but Bash, and I
like it.

<!--more-->

## Flush Memcache in Bash

Just add this to your deploy script:

```bash
$ echo "flush_all" | /bin/netcat -q 2 127.0.0.1 11211
```

(Remember: all entries will be flushed. This is not the way to fly in high
performance environments.)

## Bonus: Flush Disk Cache

Also, if you have cache files on **disk**, this is probably one of the best
ways to trash them:

```bash
$ find YOUR/WEB/DIR/app/tmp/cache/ -type f -print0 | xargs -0 rm
```

It's actually a simplified version of what PHP uses to clean up session
garbage files (see `/etc/cron.d/php5`)

What's good about this elaborate approach is that it deals with

- "**argument list too long**" by using `find` instead of `rm *`
- **unusual characters** in file names

`-print0` makes `find` separate file names with a NUL character (and `xargs -0` reads them that way), so you won't have to escape
spaces or any other 'crazy' chars
]]></content:encoded>
      <dc:date>2009-08-19T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Prepare for PHP 5.3</title>
      <link>https://kvz.io/prepare-for-php-53.html</link>
      <description><![CDATA[PHP 5.3 is a big leap forward for PHP and brings a lot of neat features.
However, big leaps can also mean big changes and potentially big breakage when
it comes to backwards compatibility.
]]></description>
      <pubDate>Wed, 29 Jul 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/prepare-for-php-53.html</guid>
      <content:encoded><![CDATA[PHP 5.3 is a big leap forward for PHP and brings a lot of neat features.
However, big leaps can also mean big changes and potentially big breakage when
it comes to backwards compatibility.

<!--more-->

I did some experimenting with running a big legacy application and a CakePHP
application on PHP 5.3 and would like to share my findings with you. Here are
a couple of tips to prepare your code for PHP 5.3

## Installing PHP 5.3 on Ubuntu Jaunty

First off, after reading this article you may want to test-drive PHP 5.3 yourself.

If you haven't tried PHP 5.3 yet and you want to give it a go right now, but
you don't want to mess up any Production or even Development environment,
[Virtualization can help you](/blog/2008/10/06/how-virtualization-will-improve-your-code/) with that.

Just

- Launch a virtual Ubuntu Jaunty instance (desktop versions work too, and
  may be easier on you)
- Use the [dotdeb.org repositories](https://www.dotdeb.org/2009/07/03/php-5-3-0-final-preview-packages-available-for-debian-lenny/) to install PHP 5.3 (thanks to
  sysadmin guru Martijn Kint for that great find)
- Point Apache's (or nginx or whatever) document root to your workstation's
  shared code directory.

Et voila. An independent test platform that's able to reflect your code
changes in real time.

Now on to my findings.

## Short Tags

When you do a fresh PHP 5.3 install on an Ubuntu Jaunty server, short tags
(also known as `<?`) may no longer work.

New releases can have support turned off by default, in which case your code is
just shown as plain text. Including db passwords.

This is the stuff that can get you fired.

So if you have these short tags, let's get rid of them. But how? There are so
many!

**Update #1** - As noted by Rune Kaagaard and Philip Olson in the comments,
support for short tags is still a subject of discussion. Whether it will be
*turned off by default* or *supported in PHP 6* also depends on the package
maintainers of each distribution (looks like Ubuntu is going for the strict
approach).

### Regex to the Rescue?

If you have a lot of short tags, the following regexes could help to convert
them:

```bash
replace: '<\?='
   with: '<\?php echo '

replace: '<\?(?!php|xml)'
   with: '<\?php'
```

The order of these replace actions is important.
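
As a rough sketch of why the order matters, here are the same two replacements applied with Python's `re` module. The `convert_short_tags` helper is made up for this illustration and is not part of any tool mentioned in this post, and its output needs the same manual review as any other bulk replace.

```python
import re


def convert_short_tags(source):
    """Expand PHP short tags; a hypothetical helper for illustration only."""
    # 1. Short echo tags first: '<?=' becomes '<?php echo '.
    source = re.sub(r'<\?=', '<?php echo ', source)
    # 2. Then every remaining '<?' that is not already '<?php' or '<?xml'.
    #    Run first, this step would turn '<?=' into the invalid '<?php='.
    source = re.sub(r'<\?(?!php|xml)', '<?php', source)
    return source


print(convert_short_tags('<?=$title?> <? foo(); ?> <?php bar(); ?>'))
# prints: <?php echo $title?> <?php foo(); ?> <?php bar(); ?>
```

Swapping the two `re.sub` calls is an easy way to see the breakage for yourself.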

Now, be sure to **go over the changes manually** with a tool like
[regexxer](https://regexxer.sourceforge.net/) though. Because if you're not careful, code generators,
highlighters & XML tools will break. Consider the following real life example:

```php
<?php
$CodeRow->replace('<?', '<?php', 'T_OPEN_TAG');
?>
```

By our replace action, this would now read:

```php
<?php
$CodeRow->replace('<?php', '<?php', 'T_OPEN_TAG');
?>
```

Making that line completely useless and introducing a bug. So please, really
go over the changes manually. Regexxer will make that job less painful
already.

**Update #2:** As Bernhard [mentions in the comments](/blog/2009/07/29/prepare-for-php-53/ "Convert shorttags PHP"), there is a much better way of removing shorttags from your PHP code.

### After the Changes

Test your code thoroughly before you commit.

## PHP Deprecated Warnings

Some 'old' syntax still works but generates "Deprecated" warnings. These are
telling you to change your code before it's too late: future versions of PHP
will no longer support it. At all.

The following example is still used in quite a lot of code, but as of PHP 5.3 will
generate "Deprecated" warnings.

```php
<?php
$log = & new Logger('file', '/var/log/app.log');
?>
```

They want you to get rid of the '&'.

Obviously - if you are the author of such lines - you should **change your code**. Now is the time.

But let's say you also use a framework (CakePHP in my case), and you are not
the author. You need to wait for others to fix it (don't worry, PHP 5.3
releases of Cake are in the works, just not stable yet). But if you still want
to run PHP 5.3 right now with a deprecated framework codebase you can't
change: you can **turn off these warnings** by changing the level of error
reporting.

Here's an example that still lets you see other debug messages like Notices
(which is great during development), but just turns off the Deprecated
warnings (which are useless **if they concern the framework's code**):

```php
<?php
error_reporting(E_ALL & ~ E_DEPRECATED);
?>
```

.. basically saying: Show all warnings except deprecated.

So put this wherever you currently set the error\_reporting level.

For CakePHP 1.2, I found I had to hack my core in
`/cake/libs/configure.php` at line `#291` to get rid of the deprecated warnings.

## MySQL

In Ubuntu (thanks Philip Olson), PHP 5.3 uses a native MySQL driver
([mysqlnd](https://dev.mysql.com/downloads/connector/php-mysqlnd/)) and it no longer accepts MySQL's old, short password hashes. We still had some legacy
16-character MySQL password hashes and so [MySQL guru Erwin Bleeker](https://www.google.com/reader/shared/02111691477068722848) upgraded them in
order for mysql\_connect to work. Good thing I suppose ; )

## Extensions

I had some extensions like php5-xdebug, php5-xcache, php5-memcache that I
previously installed with APT package management, but now gave version
conflicts with my custom dotdeb.org packages.

Well, just uninstall them with apt (or rpm, or whatever) and reinstall them
with pecl like this:

```bash
$ pecl install -f memcache
```

So that the extensions are rebuilt for your current PHP version and the
conflicts are resolved. Make sure your php.ini files point to the right .so
files though.

## User Contributions

In the comments there are developers who ran into other issues as well.
Summarizing:

- Many array functions that used to allow passing objects no longer do - Matthew Weier O'Phinney
- Convert `ereg_` functions to `preg_` ones
- `magic_quotes_gpc` and `register_globals` are gone - Giorgio Sironi

## Alright

OK so after those changes, all of my old code ran fine on PHP 5.3 and it was
time to really go & enjoy the new features like namespaces, closures, increased
performance & late static binding! Awesome..

I'm curious. **What problems did you encounter trying to run old code on PHP 5.3? Were you able to fix it?** Please share your experience.
]]></content:encoded>
      <dc:date>2009-07-29T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Notes on CakeFest 3</title>
      <link>https://kvz.io/notes-on-cakefest-3.html</link>
      <description><![CDATA[Looking back at a great CakeFest in Berlin, I learned a lot about CakePHP and
met many nice and inspiring people. Here are some conference notes I took that
were particularly useful or new to me.
]]></description>
      <pubDate>Wed, 15 Jul 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/notes-on-cakefest-3.html</guid>
      <content:encoded><![CDATA[Looking back at a great CakeFest in Berlin, I learned a lot about CakePHP and
met many nice and inspiring people. Here are some conference notes I took that
were particularly useful or new to me.

<!--more-->

These notes are not a representative summary of the event's highlights. For
that you may want to check out the resources at the end of the article, or just
attend the next CakeFest.

## Best Code Practices

- Think about making it into a **datasource**. Now think again. Now really
  hard.

Clue: Can you make atomic CRUD operations on it? Does it interface with the
outside world? It's probably a datasource.

- Use [TCPDF](https://www.tecnick.com/public/code/cp-dpage.php?aiocp-dp=tcpdf) on sourceforge for generating **PDF** files
- Try [Media Plugin](https://wiki.github.com/davidpersson/media/) for any kind of media upload, it protects you from
  all known vulnerabilities & caveats and will save you work.
- Follow the view folder logic inside /app/webroot/js directory as well


- Use rules with wildcards to explain what JavaScript files need to be
  included at what pages
- Obviously true for CSS as well

(Thanks to: [@NOSLOW](https://twitter.com/NOSLOW), [@jperras](https://twitter.com/jperras), [@felixge](https://twitter.com/felixge), [@nperson](https://twitter.com/nperson))

## Tools, Methods & Task Management

- Use "[Club Mate](https://en.wikipedia.org/wiki/Club-Mate)" to code 48 hours a day ; )
- Check out [OmniFocus](https://www.omnigroup.com/applications/omnifocus/) for tasks
- [Remember the milk](https://www.rememberthemilk.com/) could work
- [Sequel Pro](https://www.sequelpro.com/) for Mac **MySQL** management
- Sharpen your skills & keep your eye on the ball with [pair
  programming](https://en.wikipedia.org/wiki/Pair-programming), programmers will tend to get lost on their own
- [The Pomodoro Technique](https://www.pomodorotechnique.com/) could also help you get things done
  without investing/wasting too much time on the methodology itself.
- Outside the zone, you're working at 10% of your potential. It takes 50
  minutes to get into the zone. It takes 2 seconds to get out. Manage
  distractions.

(Thanks to: [@felixge](https://twitter.com/felixge), [@gwoo](https://twitter.com/gwoo), [@alkemann](https://twitter.com/alkemann))

## Cake Core

- [Cake 3](https://code.cakephp.org/cake3) is going to be very powerful, we like it. We want it. Now.
  pretty pretty please [@nateabele](https://twitter.com/nateabele)?
- Why not look at Cake's testcases for **additional** understanding &
  documentation
- If there are multiple validation errors, the last rule is displayed. So
  swap them until they make sense from a usability standpoint.
- Should have known that element() has built in caching. Just use the
  *cache* parameter.
- We're getting a nice Plugin Repository. Probably even one with CLI tools
  so we could just type 'install plugin x'

(Thanks to: [@nateabele](https://twitter.com/nateabele), [@jperras](https://twitter.com/jperras), [@gwoo](https://twitter.com/gwoo))

## Performance & Benching

- Use [Lucene](https://lucene.apache.org/java/docs/) (instead of e.g. sphinx) for text searches
- [Siege](https://freshmeat.net/projects/siege/) instead of Apache **Benchmark** for performance benchmarking.
- [Pagespeed](https://code.google.com/speed/page-speed/download.html) is Google's **yslow**. So a client side benchmarking addon
  for Firefox. Supposed to be better than yslow at some things too. 2h blog
  reading a day... How did I miss this?
- To avoid a DB stampede, use a 'two-expire' system: track a shorter expiry
  time yourself inside memcached. When that one passes, the first request
  resets it to a small time again & then updates DB & memcached so the value
  is valid again. That way only 1 request ends up being processed by the
  database. As opposed to an actual expiry, where all *slashdotters* will
  massively hit the DB and it may not even be able to restore the memcache
  entry.
- [Pecl/inclued](https://pecl.php.net/package/inclued) to show all source dependencies
- Don't use cache blacklisting. Use whitelisting instead. So e.g. cache
  elements, and don't use `<cake:nocache>` ever again.
- Use [Gearman](https://pear.php.net/package/Net-Gearman) to schedule out jobs to other machines.

(Thanks to: [@jperras](https://twitter.com/jperras), [@teemow](https://twitter.com/teemow))
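
The 'two-expire' idea from the list above could look roughly like this. It is a sketch under assumptions, not what was shown at the conference: a plain dict stands in for memcached, and the `TwoExpireCache` class, its `soft_ttl` and `grace` parameters and the `recompute` callback are all names invented here.

```python
import time


class TwoExpireCache:
    """Soft-expiry cache; the dict is a stand-in for memcached."""

    def __init__(self, soft_ttl, grace, clock=time.time):
        self._store = {}  # key -> (value, soft_deadline)
        self._soft_ttl = soft_ttl
        self._grace = grace
        self._clock = clock

    def get(self, key, recompute):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None:
            value, soft_deadline = entry
            if now < soft_deadline:
                return value  # still fresh, no database work
            # Soft-expired: push the deadline forward a little so that
            # everyone else keeps getting the stale value meanwhile.
            # (A real memcached setup would need an atomic add/cas here.)
            self._store[key] = (value, now + self._grace)
        # Only the request that noticed the expiry reaches the database.
        # A completely cold key is not protected by this sketch.
        value = recompute()
        self._store[key] = (value, self._clock() + self._soft_ttl)
        return value
```

For this to work against a real memcached, the entry's actual expiry would have to be set well beyond the soft one, otherwise there is no stale value left to hand out while the refresh runs.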

## Source Control & Deploy

- Go use [github](https://github.com/) for your projects. No, really. Go... Are you still
  here?
- Ignoring files & branching in [GIT](https://git-scm.com/) is way easier than SVN
- [Assembla](https://www.assembla.com/) has a [continuous integration](https://en.wikipedia.org/wiki/Continuous-integration) AMI (virtual image) that
  you can launch right now @ Amazon EC2
- Use [Capistrano](https://en.wikipedia.org/wiki/Capistrano) for simultaneous deployment on multiple servers

(Thanks to: [@felixge](https://twitter.com/felixge), [@alkemann](https://twitter.com/alkemann), [@d1rk](https://twitter.com/d1rk), [@gwoo](https://twitter.com/gwoo), [@jperras](https://twitter.com/jperras))

## Testing

- [Selenium](https://seleniumhq.org/) can fire up browsers and test your sites like an actual
  user. It will tell you if unexpected output returns.
- Unit testing. "Be your own **giant**", make tests while you write not
  after. Then: refactor & optimize and be sure your test cases keep working.
- Don't go for 100% test coverage. Test **topdown** instead. That usually
  provides the best return: bugs reduced / time spent

(Thanks to: [@felixge](https://twitter.com/felixge), [@alkemann](https://twitter.com/alkemann))

## I18n & Translation

- Use %s inside \_\_() inside sprintf()
- Use the cake console to index all translate strings: \_\_() which can then
  be read by poedit
- [Poedit](https://www.poedit.net/) is a good tool for translators. It works for all OSes and
  developers/testers can afterwards flag invalid translations
- Language code must be in URL & not some cookie. Think of the crawlers.
- There is a Translate behavior to do the heavy lifting
- Instead of sprintf, one can also use the Cake built-in String::insert()
  method and have tokens (:name) instead of %s (or %1s)

(Thanks to: [@pierremartin](https://twitter.com/pierremartin), [@miglesias](https://twitter.com/miglesias))

## Security

- Use [PHP IDS](https://php-ids.org/) to detect [XSS](https://en.wikipedia.org/wiki/Cross-site-scripting) & [SQL Injection](https://en.wikipedia.org/wiki/SQL-injection) attacks
- Never trust mimetypes, they can & will be forged so people can upload
  scripts & just execute them
- Resizing images will get rid of harmful content. Never serve original
  files.

(Thanks to: [@nperson](https://twitter.com/nperson), [@teemow](https://twitter.com/teemow))

## Further Resources

### Day

- [@felixge](https://twitter.com/felixge)'s [Summary of CakeFest #3](https://debuggable.com/posts/summary-of-cakefest-3-berlin:4a5ca8fe-) over at the debuggable.com blog
  (check out their new layout!)
- [@alkemann](https://twitter.com/alkemann)'s [CakeFest 3 roundup](https://illustrata.no/blog/2009/07/cakefest-3-roundup/) with some pictures as well
- Very cool [mind map](https://mindmeister.com/24339080) put together by [@timk\_](https://twitter.com/timk_) for presentation topics at #CakeFest (via [@predominant](https://twitter.com/predominant))
- All of [@predominant](https://twitter.com/predominant)'s info on the [#cakefest](https://twitter.com/i/#!/search/%23cakefest) [downloads page](https://cakephp.org/downloads/CakeFest/CakeFest%203%20-%20Berlin%202009)

### Night

- You will find the Night Shots in the private "Cake War Faces" app that
  will be launched when ready ; ) [James Fairhurst](https://www.jamesfairhurst.co.uk/) was already kind enough to
  donate his "[CakeBattles](https://cakebattles.jamesfairhurst.co.uk/)" app to me, which I will base it on.

You could check [@kvz](https://twitter.com/kvz) to find out when I launch it.

## Suggestions?

I already said this wouldn't be a complete summary of the event, basically
just the things I want to follow up on personally. But if you'd still like me to
add or change something, just drop a line below.
]]></content:encoded>
      <dc:date>2009-07-15T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Create Youtube-Like IDs With PHP/Python/Javascript/Java/SQL</title>
      <link>https://kvz.io/create-short-ids-with-php-like-youtube-or-tinyurl.html</link>
      <description><![CDATA[IDs are often numbers. Unfortunately there are only 10 digits to work with,
so if you have a lot of records, IDs tend to get very lengthy. For
computers that's OK. But human beings like their IDs as short as possible.
So how can we make IDs shorter? Well, we could borrow characters from
the alphabet and have them pose as additional digits....
Alphabet to the rescue!
]]></description>
      <pubDate>Wed, 10 Jun 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/create-short-ids-with-php-like-youtube-or-tinyurl.html</guid>
      <content:encoded><![CDATA[IDs are often numbers. Unfortunately there are only 10 digits to work with,
so if you have a lot of records, IDs tend to get very lengthy. For
computers that's OK. But human beings like their IDs as short as possible.
So how can we **make IDs shorter**? Well, we could borrow characters from
the alphabet and have them pose as additional digits....
Alphabet to the rescue!

<!--more-->

Other title options were

- How to create unique short string IDs with PHP & MySQL
- Or how to create IDs similar to YouTube e.g. yzNjIBEdyww

I created this function a long time ago. Time to be **nice and share**.

## More is Less - the 'math'

The alphabet has 26 characters. That's a lot more than 10 digits.
If we also distinguish upper- and lowercase,
and add digits to the bunch for the heck of it,
we already have (26 x 2 + 10) **62 options** we can use **per position**
in the ID.

Now of course we can also add additional funny characters to
'the bunch' like - / \* & #  but those may cause problems in URLs
and that's our target audience for now.

OK so because there are roughly **6x more characters** we will
use per position, IDs will get much **shorter**.
We can just fit a lot **more data in each position**.
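
To put a number on "much shorter", here is a small Python sketch that counts how many positions a value needs in base 10 versus base 62. The `positions_needed` helper is made up for this illustration and is not part of alphaID.

```python
def positions_needed(number, base):
    """Count how many positions `number` takes when written in `base`."""
    positions = 1
    while number >= base:
        number //= base
        positions += 1
    return positions


for value in (999, 999999, 999999999999):
    print(value, positions_needed(value, 10), positions_needed(value, 62))
# prints:
# 999 3 2
# 999999 6 4
# 999999999999 12 7
```

So a 12 digit ID fits in 7 characters once every position can hold 62 different values.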

This is basically what url shortening services do like tinyurl,
is.gd, or bit.ly. But similar IDs can also be found at
youtube: [https://www.youtube.com/watch?v=**yzNjIBEdyww**](https://www.youtube.com/watch?v=yzNjIBEdyww)

## Convert Your IDs

Now unlike Database servers: webservers are easy to scale so you can
let them do a bit of converting to ease the life of your users, while
keeping your database fast with numbers
(MySQL really likes them plain numbers ; )

To do the conversion I've written a PHP function that can translate
big numbers to short strings and vice versa. I call it: alphaID.

The resulting string is not hard to decipher, but it can be a very
nice feature to make URLs or directory structures more compact and
significant.

So basically:

- when someone requests rLHWfKd
- alphaID() converts it to 999999999999
- you lookup the record for id 999999999999 in your database

## Source

```php
<?php
/**
 * Translates a number to a short alphanumeric version
 *
 * Translates any number up to 9007199254740992
 * to a shorter version in letters e.g.:
 * 9007199254740989 --> PpQXn7COf
 *
 * specifying the second argument true, it will
 * translate back e.g.:
 * PpQXn7COf --> 9007199254740989
 *
 * this function is based on any2dec && dec2any by
 * fragmer[at]mail[dot]ru
 * see: https://nl3.php.net/manual/en/function.base-convert.php#52450
 *
 * If you want the alphaID to be at least 3 letter long, use the
 * $pad_up = 3 argument
 *
 * In most cases this is better than totally random ID generators
 * because this can easily avoid duplicate ID's.
 * For example if you correlate the alpha ID to an auto incrementing ID
 * in your database, you're done.
 *
 * The reverse is done because it makes it slightly more cryptic,
 * but it also makes it easier to spread lots of IDs in different
 * directories on your filesystem. Example:
 * $part1 = substr($alpha_id,0,1);
 * $part2 = substr($alpha_id,1,1);
 * $part3 = substr($alpha_id,2,strlen($alpha_id));
 * $destindir = "/".$part1."/".$part2."/".$part3;
 * // by reversing, directories are more evenly spread out. The
 * // first 26 directories already occupy 26 main levels
 *
 * more info on limitation:
 * - https://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/165372
 *
 * if you really need this for bigger numbers you probably have to look
 * at things like: https://theserverpages.com/php/manual/en/ref.bc.php
 * or: https://theserverpages.com/php/manual/en/ref.gmp.php
 * but I haven't really dug into this. If you have more info on those
 * matters feel free to leave a comment.
 *
 * The following code block can be utilized by PEAR's Testing_DocTest
 * <code>
 * // Input //
 * $number_in = 2188847690240;
 * $alpha_in  = "SpQXn7Cb";
 *
 * // Execute //
 * $alpha_out  = alphaID($number_in, false, 8);
 * $number_out = alphaID($alpha_in, true, 8);
 *
 * if ($number_in != $number_out) {
 *   echo "Conversion failure, ".$alpha_in." returns ".$number_out." instead of the ";
 *   echo "desired: ".$number_in."\n";
 * }
 * if ($alpha_in != $alpha_out) {
 *   echo "Conversion failure, ".$number_in." returns ".$alpha_out." instead of the ";
 *   echo "desired: ".$alpha_in."\n";
 * }
 *
 * // Show //
 * echo $number_out." => ".$alpha_out."\n";
 * echo $alpha_in." => ".$number_out."\n";
 * echo alphaID(238328, false)." => ".alphaID(alphaID(238328, false), true)."\n";
 *
 * // expects:
 * // 2188847690240 => SpQXn7Cb
 * // SpQXn7Cb => 2188847690240
 * // aaab => 238328
 *
 * </code>
 *
 * @author  Kevin van Zonneveld <kevin@vanzonneveld.net>
 * @author  Simon Franz
 * @author  Deadfish
 * @author  SK83RJOSH
 * @copyright 2008 Kevin van Zonneveld (https://kevin.vanzonneveld.net)
 * @license   https://www.opensource.org/licenses/bsd-license.php New BSD Licence
 * @version   SVN: Release: $Id: alphaID.inc.php 344 2009-06-10 17:43:59Z kevin $
 * @link    https://kevin.vanzonneveld.net/
 *
 * @param mixed   $in   String or long input to translate
 * @param boolean $to_num  Reverses translation when true
 * @param mixed   $pad_up  Number or boolean pads the result up to a specified length
 * @param string  $pass_key Supplying a password makes it harder to calculate the original ID
 *
 * @return mixed string or long
 */
function alphaID($in, $to_num = false, $pad_up = false, $pass_key = null)
{
  $out   =   '';
  $index = 'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  $base  = strlen($index);

  if ($pass_key !== null) {
    // Although this function's purpose is to just make the
    // ID short - and not so much secure,
    // with this patch by Simon Franz (https://blog.snaky.org/)
    // you can optionally supply a password to make it harder
    // to calculate the corresponding numeric ID

    for ($n = 0; $n < strlen($index); $n++) {
      $i[] = substr($index, $n, 1);
    }

    $pass_hash = hash('sha256',$pass_key);
    $pass_hash = (strlen($pass_hash) < strlen($index) ? hash('sha512', $pass_key) : $pass_hash);

    for ($n = 0; $n < strlen($index); $n++) {
      $p[] =  substr($pass_hash, $n, 1);
    }

    array_multisort($p, SORT_DESC, $i);
    $index = implode($i);
  }

  if ($to_num) {
    // Digital number  <<--  alphabet letter code
    // Start from a numeric 0: adding a number to '' is an error on PHP 8
    $out = 0;
    $len = strlen($in) - 1;

    for ($t = $len; $t >= 0; $t--) {
      $bcp = bcpow($base, $len - $t);
      $out = $out + strpos($index, substr($in, $t, 1)) * $bcp;
    }

    if (is_numeric($pad_up)) {
      $pad_up--;

      if ($pad_up > 0) {
        $out -= pow($base, $pad_up);
      }
    }
  } else {
    // Digital number  -->>  alphabet letter code
    if (is_numeric($pad_up)) {
      $pad_up--;

      if ($pad_up > 0) {
        $in += pow($base, $pad_up);
      }
    }

    for ($t = ($in != 0 ? floor(log($in, $base)) : 0); $t >= 0; $t--) {
      $bcp = bcpow($base, $t);
      $a   = floor($in / $bcp) % $base;
      $out = $out . substr($index, $a, 1);
      $in  = $in - ($a * $bcp);
    }
  }

  return $out;
}
```

[Get from GitHub](https://raw.github.com/kvz/deprecated/kvzlib/php/functions/alphaID.inc.php)

## Example

Running:

```php
alphaID(9007199254740989);
```

will return `PpQXn7COf` and:

```php
alphaID('PpQXn7COf', true);
```

will return `9007199254740989`

Easy right?
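
For readers who don't speak PHP, the core conversion can be sketched in Python like this. It is an illustration only: it borrows the `$index` alphabet from the function above, leaves out `$pad_up` and `$pass_key`, and is not guaranteed to produce the exact strings quoted in this post. The `encode` and `decode` names are invented here.

```python
INDEX = 'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
BASE = len(INDEX)


def encode(number):
    """Write a non-negative integer using the characters of INDEX."""
    if number < 0:
        raise ValueError('number must not be negative')
    if number == 0:
        return INDEX[0]
    chars = []
    while number > 0:
        # Peel off the least significant position each round.
        number, remainder = divmod(number, BASE)
        chars.append(INDEX[remainder])
    # Positions were collected back to front.
    return ''.join(reversed(chars))


def decode(alpha):
    """Turn a string produced by encode() back into the integer."""
    number = 0
    for char in alpha:
        number = number * BASE + INDEX.index(char)
    return number


short = encode(999999999999)
print(len(short), decode(short))
```

Because Python integers have no upper bound, this sketch does not share the 9007199254740992 limit mentioned in the docblock.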

## More Features

- There also is an optional third argument: `$pad_up`. This enables you to
  make the resulting alphaID at least **X** characters long.
- You can support even more characters (making the resulting alphaID
  even smaller) by adding characters to the `$index` var at the top of the
  function body.

## Bonus

Thanks to some wonderful contributions in the comment section,
here are some interesting updates & additions:

### Pro Tip

You may want to remove vowels (a, e, o, u, i) from `$index`
to avoid combinations that result in: 'penis' or other dirty words
that could get your customers upset.

You can also use the `$pad_up` argument to enforce a minimum length
of 5 characters as to avoid: 'nsfw' and 'wtf'.

Thanks to William for pointing this out ; )
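
A rough Python sketch of what that tip costs you, assuming the 62-character alphabet from the ports further down; `SAFE_ALPHABET`, `chars_needed` and `is_possible` are names invented for this example. Dropping the vowels leaves 52 characters, so IDs get slightly longer, and vowel-free words such as 'wtf' remain possible, which is why the minimum length helps:

```python
import string

# Assumed full alphabet: lowercase, digits, uppercase (62 characters)
FULL_ALPHABET = string.ascii_lowercase + string.digits + string.ascii_uppercase
# Same alphabet with both lowercase and uppercase vowels removed
SAFE_ALPHABET = "".join(c for c in FULL_ALPHABET if c.lower() not in "aeiou")


def chars_needed(max_id, base):
    """Return the shortest length that can represent every ID up to max_id."""
    length = 1
    while base ** length <= max_id:
        length += 1
    return length


def is_possible(word, alphabet):
    """True if every character of word is available in alphabet."""
    return all(char in alphabet for char in word)


print(len(FULL_ALPHABET), len(SAFE_ALPHABET))  # 62 52
print(chars_needed(9007199254740989, 62))      # 9
print(chars_needed(9007199254740989, 52))      # 10
print(is_possible("penis", SAFE_ALPHABET))     # False
print(is_possible("wtf", SAFE_ALPHABET))       # True
```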

### Postgres Implementation

Thanks to William as well:

```sql
CREATE OR REPLACE FUNCTION string_to_bits(input_text TEXT)
RETURNS TEXT AS $$
DECLARE
  output_text TEXT;
  i INTEGER;
BEGIN
  output_text := '';

  FOR i IN 1..char_length(input_text) LOOP
    output_text := output_text || ascii(substring(input_text FROM i FOR 1))::bit(8);
  END LOOP;

  return output_text;
END;
$$ LANGUAGE plpgsql;


CREATE OR REPLACE FUNCTION id_to_sid(id INTEGER)
RETURNS TEXT AS $$
DECLARE
  output_text TEXT;
  i INTEGER;
  index TEXT[];
  bits TEXT;
  bit_array TEXT[];
  input_text TEXT;
BEGIN
  input_text := id::TEXT;
  output_text := '';
  index := string_to_array('0,d,A,3,E,z,W,m,D,S,Q,l,K,s,P,b,N,c,f,j,5,I,t,C,i,y,o,G,2,r,x,h,V,J,k,-,T,w,H,L,9,e,u,X,p,U,a,O,v,4,R,B,q,M,n,g,1,F,6,Y,_,8,7,Z', ',');

  bits := string_to_bits(input_text);

  IF length(bits) % 6 <> 0 THEN
    bits := rpad(bits, length(bits) + 6 - (length(bits) % 6), '0');
  END IF;

  FOR i IN 1..((length(bits) / 6)) LOOP
    IF i = 1 THEN
      bit_array[i] := substring(bits FROM 1 FOR 6);
    ELSE
      bit_array[i] := substring(bits FROM 1 + (i - 1) * 6 FOR 6);
    END IF;

    output_text := output_text || index[bit_array[i]::bit(6)::integer + 1];
  END LOOP;

  return output_text;
END;
$$ LANGUAGE plpgsql;
```

### Java Implementation

Thanks to [Ant Kutschera](https://blog.maxant.co.uk/pebble/2010/02/02/1265138340000.html) there also is a Java version.

```java
package uk.co.maxant.util;

import java.math.BigInteger;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * allows you to convert a whole number into a compacted representation of that number,
 * based upon the dictionary you provide. very similar to base64 encoding, or indeed hex
 * encoding.
 */
public class BaseX {

  /**
   * contains hexadecimals 0-F only.
   */
  public static final char[] DICTIONARY_16 =
    new char[]{'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};

  /**
   * contains only alphanumerics, in capitals and excludes letters/numbers which can be confused,
   * eg. 0 and O or L and I and 1.
   */
  public static final char[] DICTIONARY_32 =
    new char[]{'1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','J','K','M','N','P','Q','R','S','T','U','V','W','X','Y','Z'};

  /**
   * contains only alphanumerics, including both capitals and smalls.
   */
  public static final char[] DICTIONARY_62 =
    new char[]{'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'};

  /**
   * contains alphanumerics, including both capitals and smalls, and the following special chars:
   * +"@*#%&/|()=?'~[!]{}-_:.,; (you might not be able to read all those using a browser!
   */
  public static final char[] DICTIONARY_89 =
    new char[]{'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','+','"','@','*','#','%','&','/','|','(',')','=','?','~','[',']','{','}','$','-','_','.',':',',',';','<','>'};

  protected char[] dictionary;

  /**
   * create an encoder with the given dictionary.
   *
   * @param dictionary the dictionary to use when encoding and decoding.
   */
  public BaseX(char[] dictionary){
    this.dictionary = dictionary;
  }

  /**
   * creates an encoder with the {@link #DICTIONARY_62} dictionary.
   */
  public BaseX(){
    this.dictionary = DICTIONARY_62;
  }

  /**
   * tester method.
   */
  public static void main(String[] args) {
    String original = "123456789012345678901234567890";
    System.out.println("Original: " + original);
    BaseX bx = new BaseX(DICTIONARY_62);
    String encoded = bx.encode(new BigInteger(original));
    System.out.println("encoded: " + encoded);
    BigInteger decoded = bx.decode(encoded);
    System.out.println("decoded: " + decoded);
    if(original.equals(decoded.toString())){
      System.out.println("Passed! decoded value is the same as the original.");
    }else{
      System.err.println("FAILED! decoded value is NOT the same as the original!!");
    }
  }

  /**
   * encodes the given string into the base of the dictionary provided in the constructor.
   * @param value the number to encode.
   * @return the encoded string.
   */
  public String encode(BigInteger value) {

    List<Character> result = new ArrayList<Character>();
    BigInteger base = new BigInteger("" + dictionary.length);
    int exponent = 1;
    BigInteger remaining = value;
    while(true){
      BigInteger a = base.pow(exponent); //16^1 = 16
      BigInteger b = remaining.mod(a); //119 % 16 = 7 | 112 % 256 = 112
      BigInteger c = base.pow(exponent - 1);
      BigInteger d = b.divide(c);

      //if d > dictionary.length, we have a problem. but BigInteger doesnt have
      //a greater than method :-(  hope for the best. theoretically, d is always
      //an index of the dictionary!
      result.add(dictionary[d.intValue()]);
      remaining = remaining.subtract(b); //119 - 7 = 112 | 112 - 112 = 0

      //finished?
      if(remaining.equals(BigInteger.ZERO)){
        break;
      }

      exponent++;
    }

    //need to reverse it, since the start of the list contains the least significant values
    StringBuffer sb = new StringBuffer();
    for(int i = result.size()-1; i >= 0; i--){
      sb.append(result.get(i));
    }
    return sb.toString();
  }

  /**
   * decodes the given string from the base of the dictionary provided in the constructor.
   * @param str the string to decode.
   * @return the decoded number.
   */
  public BigInteger decode(String str) {

    //reverse it, coz its already reversed!
    char[] chars = new char[str.length()];
    str.getChars(0, str.length(), chars, 0);

    char[] chars2 = new char[str.length()];
    int i = chars2.length -1;
    for(char c : chars){
      chars2[i--] = c;
    }

    //for efficiency, make a map
    Map<Character, BigInteger> dictMap = new HashMap<Character, BigInteger>();
    int j = 0;
    for(char c : dictionary){
      dictMap.put(c, new BigInteger("" + j++));
    }

    BigInteger bi = BigInteger.ZERO;
    BigInteger base = new BigInteger("" + dictionary.length);
    int exponent = 0;
    for(char c : chars2){
      BigInteger a = dictMap.get(c);
      BigInteger b = base.pow(exponent).multiply(a);
      bi = bi.add(new BigInteger("" + b));
      exponent++;
    }

    return bi;

  }
}
```

### JavaScript Implementation

Thanks to Even Simon, there's a JavaScript implementation.
You will also find a PHP version there, which implements the encode & decode
functions as separate methods in a class.

```javascript
/**
 *  Javascript AlphabeticID class
 *  (based on a script by Kevin van Zonneveld <kevin@vanzonneveld.net>)
 *
 *  Author: Even Simon <even.simon@gmail.com>
 *
 *  Description: Translates a numeric identifier into a short string and back again.
 *
 *  Usage:
 *    var str = AlphabeticID.encode(9007199254740989); // str = 'fE2XnNGpF'
 *    var id = AlphabeticID.decode('fE2XnNGpF'); // id = 9007199254740989;
 **/

var AlphabeticID = {
  index:'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ',

  /**
   *  @function AlphabeticID.encode
   *  @description Encode a number into a short string
   *  @param integer
   *  @return string
   **/
  encode:function(_number){
    if('undefined' == typeof _number){
      return null;
    }
    else if('number' != typeof(_number)){
      throw new Error('Wrong parameter type');
    }

    var ret = '';

    for(var i=Math.floor(Math.log(parseInt(_number))/Math.log(AlphabeticID.index.length));i>=0;i--){
      ret = ret + AlphabeticID.index.substr((Math.floor(parseInt(_number) / AlphabeticID.bcpow(AlphabeticID.index.length, i)) % AlphabeticID.index.length),1);
    }

    return ret.reverse();
  },

  /**
   *  @function AlphabeticID.decode
   *  @description Decode a short string and return number
   *  @param string
   *  @return integer
   **/
  decode:function(_string){
    if('undefined' == typeof _string){
      return null;
    }
    else if('string' != typeof _string){
      throw new Error('Wrong parameter type');
    }

    var str = _string.reverse();
    var ret = 0;

    for(var i=0;i<=(str.length - 1);i++){
      ret = ret + AlphabeticID.index.indexOf(str.substr(i,1)) * (AlphabeticID.bcpow(AlphabeticID.index.length, (str.length - 1) - i));
    }

    return ret;
  },

  /**
   *  @function AlphabeticID.bcpow
   *  @description Raise _a to the power _b
   *  @param float _a
   *  @param integer _b
   *  @return string
   **/
  bcpow:function(_a, _b){
    return Math.floor(Math.pow(parseFloat(_a), parseInt(_b)));
  }
};

/**
 *  @function String.reverse
 *  @description Reverse a string
 *  @return string
 **/
String.prototype.reverse = function(){
  return this.split('').reverse().join('');
};
```

### C# Implementation

Thanks to Romas, there's a C# implementation.

Improved by [rumble|strip](https://twitter.com/rsadventure/status/481138491300933633)

```csharp
// static is required here: ReverseString below is an extension method
static class ShortId
{
    public static readonly string Alphabet = "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    private static decimal BcPow(double a, double b)
    {
      return Math.Floor((decimal)Math.Pow(a, b));
    }

    public static ulong Decode(string value, int pad = 0)
    {
      value = value.ReverseString();
      var len = value.Length - 1;
      ulong result = 0;

      for (int t = len; t >= 0; t--)
      {
        var bcp = (ulong)BcPow(Alphabet.Length, len - t);
        result += (ulong)Alphabet.IndexOf(value[t]) * bcp;
      }

      if (pad > 0)
      {
        result -= (ulong)BcPow(Alphabet.Length, pad);
      }

      return result;
    }

    public static string Encode(byte[] value, int startIndex = 0, int pad = 0)
    {
      return Encode(BitConverter.ToUInt64(value, startIndex), pad);
    }

    public static string Encode(Guid guid, int pad = 0)
    {
      var bytes = guid.ToByteArray();

      var first = Encode(bytes, 0, pad);
      var second = Encode(bytes, 8, pad);

      return first + second;
    }

    public static string Encode(ulong value, int pad = 0)
    {
      var result = string.Empty;

      if (pad > 0)
      {
        value += (ulong)BcPow(Alphabet.Length, pad);
      }

      for (var t = (value != 0 ? Math.Floor(Math.Log(value, Alphabet.Length)) : 0); t >= 0; t--)
      {
        var bcp = (ulong)BcPow(Alphabet.Length, t);
        var a = ((ulong)Math.Floor((decimal)value / (decimal)bcp)) % (ulong)Alphabet.Length;
        result += Alphabet[(int)a];
        value  = value - (a * bcp);
      }

      return result.ReverseString();
    }

    private static string ReverseString(this string value)
    {
      char[] arr = value.ToCharArray();
      Array.Reverse(arr);
      return new string(arr);
    }
  }
```

### Python Implementations

Thanks to [wessite](https://wessite.com/), there's a Python implementation.

```python
import math

ALPHABET = "bcdfghjklmnpqrstvwxyz0123456789BCDFGHJKLMNPQRSTVWXYZ"
BASE = len(ALPHABET)
MAXLEN = 6

def encode_id(n):
    # Offset the number so the result is always at least MAXLEN characters
    pad = MAXLEN - 1
    n = int(n + pow(BASE, pad))

    s = []
    t = int(math.log(n, BASE))
    while True:
        bcp = int(pow(BASE, t))
        a = (n // bcp) % BASE  # integer division, also correct on Python 3
        s.append(ALPHABET[a:a+1])
        n = n - (a * bcp)
        t -= 1
        if t < 0: break

    return "".join(reversed(s))

def decode_id(n):
    n = "".join(reversed(n))
    s = 0
    l = len(n) - 1
    t = 0
    while True:
        bcpow = int(pow(BASE, l - t))
        s = s + ALPHABET.index(n[t:t+1]) * bcpow
        t += 1
        if t > l: break

    # Remove the offset that encode_id added
    pad = MAXLEN - 1
    s = int(s - pow(BASE, pad))

    return int(s)
```

[Noah Miller](https://www.facebook.com/obnyis) contributed a version based on
Wessite's that adds passkey support and rolls everything into one function:

```python
import math
import hashlib

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def alphaID(idnum, to_num=False, pad_up=False, passkey=None):
  index = ALPHABET
  if passkey:
    i = list(index)
    # hashlib needs bytes, and zip() returns an iterator on Python 3
    passhash = hashlib.sha256(passkey.encode('utf-8')).hexdigest()
    passhash = hashlib.sha512(passkey.encode('utf-8')).hexdigest() if len(passhash) < len(index) else passhash
    p = list(passhash)[0:len(index)]
    index = ''.join(list(zip(*sorted(zip(p,i))))[1])

  base = len(index)

  if to_num:
    idnum = idnum[::-1]
    out = 0
    length = len(idnum) -1
    t = 0
    while True:
      bcpow = int(pow(base, length - t))
      out = out + index.index(idnum[t:t+1]) * bcpow
      t += 1
      if t > length: break

    if pad_up:
      pad_up -= 1
      if pad_up > 0:
        out -= int(pow(base, pad_up))
  else:
    if pad_up:
      pad_up -= 1
      if pad_up > 0:
        idnum += int(pow(base, pad_up))

    out = []
    t = int(math.log(idnum, base))
    while True:
      bcp = int(pow(base, t))
      a = (idnum // bcp) % base
      out.append(index[a:a+1])
      idnum = idnum - (a * bcp)
      t -= 1
      if t < 0: break

    out = ''.join(out[::-1])

  return out
```

### HaXe Implementation

Thanks to [Andy Li](https://www.onthewings.net/), there's a HaXe implementation.

```java
/**
 *  HaXe version of AlphabeticID
 *  Author: Andy Li <andy@onthewings.net>
 *  ported from...
 *
 *  Javascript AlphabeticID class
 *  Author: Even Simon <even.simon@gmail.com>
 *  which is based on a script by Kevin van Zonneveld <kevin@vanzonneveld.net>)
 *
 *  Description: Translates a numeric identifier into a short string and back again.
 *  https://kevin.vanzonneveld.net/techblog/article/create_short_ids_with_php_like_youtube_or_tinyurl/
 **/

class AlphaID {
    static public var index:String = 'abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';

    static public function encode(_number:Int):String {
        var strBuf = new StringBuf();

        var i = 0;
        var end = Math.floor(Math.log(_number)/Math.log(index.length));
        while(i <= end) {
            strBuf.add(index.charAt((Math.floor(_number / bcpow(index.length, i++)) % index.length)));
        }

        return strBuf.toString();
    }

    static public function decode(_string:String):Int {
        var str = reverseString(_string);
        var ret = 0;

        var i = 0;
        var end = str.length - 1;
        while(i <= end) {
            ret += Std.int(index.indexOf(str.charAt(i)) * (bcpow(index.length, end-i)));
            ++i;
        }

        return ret;
    }

    inline static private function bcpow(_a:Float, _b:Float):Float {
        return Math.floor(Math.pow(_a, _b));
    }

    inline static private function reverseString(inStr:String):String {
        var ary = inStr.split("");
        ary.reverse();
        return ary.join("");
    }
}
```

### Go Implementation

Thanks to [Dinesh Appavoo](https://github.com/dineshappavoo), there's a Go implementation.

```go
// Package basex generates alpha id (alphanumeric id) for big integers.  This
// is particularly useful for shortening URLs.
package basex

import (
	"math/big"
	"strconv"
)

var (
	dictionary = []byte{'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'}
)

// Encode converts the big integer to alpha id (an alphanumeric id with mixed cases)
func Encode(val string) string {
	var result []byte
	var index int
	var strVal string

	base := big.NewInt(int64(len(dictionary)))
	a := big.NewInt(0)
	b := big.NewInt(0)
	c := big.NewInt(0)
	d := big.NewInt(0)

	exponent := 1

	remaining := big.NewInt(0)
	remaining.SetString(val, 10)

	for remaining.Cmp(big.NewInt(0)) != 0 {
		a.Exp(base, big.NewInt(int64(exponent)), nil) //16^1 = 16
		b = b.Mod(remaining, a)                       //119 % 16 = 7 | 112 % 256 = 112
		c = c.Exp(base, big.NewInt(int64(exponent-1)), nil)
		d = d.Div(b, c)

		//if d > dictionary.length, we have a problem. but BigInteger doesnt have
		//a greater than method :-(  hope for the best. theoretically, d is always
		//an index of the dictionary!
		strVal = d.String()
		index, _ = strconv.Atoi(strVal)
		result = append(result, dictionary[index])
		remaining = remaining.Sub(remaining, b) //119 - 7 = 112 | 112 - 112 = 0
		exponent = exponent + 1
	}

	//need to reverse it, since the start of the list contains the least significant values
	return string(reverse(result))
}

// Decode converts the alpha id to big integer
func Decode(s string) string {
	//reverse it, coz its already reversed!
	chars2 := reverse([]byte(s))

	//for efficiency, make a map
	dictMap := make(map[byte]*big.Int)

	j := 0
	for _, val := range dictionary {
		dictMap[val] = big.NewInt(int64(j))
		j = j + 1
	}

	bi := big.NewInt(0)
	base := big.NewInt(int64(len(dictionary)))

	exponent := 0
	a := big.NewInt(0)
	b := big.NewInt(0)
	intermed := big.NewInt(0)

	for _, c := range chars2 {
		a = dictMap[c]
		intermed = intermed.Exp(base, big.NewInt(int64(exponent)), nil)
		b = b.Mul(intermed, a)
		bi = bi.Add(bi, b)
		exponent = exponent + 1
	}
	return bi.String()
}

func reverse(bs []byte) []byte {
	for i, j := 0, len(bs)-1; i < j; i, j = i+1, j-1 {
		bs[i], bs[j] = bs[j], bs[i]
	}
	return bs
}
```


### C++ Implementation

Thanks to [Kay Makowsky](https://github.com/KMakowsky), there's a C++ implementation.

```cpp
//
//  ShortID.h
//  SocketServer
//
//  Created by Kay Makowsky on 16.06.16.
//  Copyright © 2016 Kay Makowsky. All rights reserved.
//

#ifndef ShortID_h
#define ShortID_h

#include <string>
#include <cmath>

const static std::string Alphabet = "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

class ShortID
{
public:
    static double BcPow(double a, double b)
    {
        return std::floor(std::pow(a, b));
    }
    
    static long Decode(std::string value, int pad = 0)
    {
        long len = value.length() - 1;
        unsigned long result = 0;
        
        for (long t = len; t >= 0; t--)
        {
            unsigned long bcp = (unsigned long)BcPow(Alphabet.length(), len - t);
            result += (unsigned long)Alphabet.find(value[t]) * bcp;
        }
        
        if (pad > 0)
        {
            result -= (unsigned long)BcPow(Alphabet.length(), pad);
        }
        
        return result;
    }
    
    static std::string Encode(unsigned long value, int pad = 0)
    {
        std::string result = "";
        
        if (pad > 0)
        {
            value += (unsigned long)BcPow(Alphabet.length(), pad);
        }
        // Only take the logarithm of non-zero values: log(0) is -infinity
        int lg = (value != 0 ? (int)(std::log(value) / std::log(Alphabet.length())) : 0);
        for (int t = lg; t >= 0; t--)
        {
            unsigned long bcp = (unsigned long)BcPow(Alphabet.length(), t);
            unsigned long a = ((unsigned long)std::floor((double)value / (double)bcp)) % (unsigned long)Alphabet.length();
            result += Alphabet[(int)a];
            value  = value - (a * bcp);
        }
        
        return result;
    }
    
private:
    static std::string reverseString(std::string value)
    {
        return std::string (value.rbegin(), value.rend());
    }
};

#endif
```
]]></content:encoded>
      <dc:date>2009-06-10T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Install the Best Coding Font</title>
      <link>https://kvz.io/install-the-best-coding-font.html</link>
      <description><![CDATA[If you are in IT professionally (coding or sysadmin) you will be staring at
monospaced fonts for many many hours a day. So it's probably justified to
spend 2 minutes picking a very good one. It can make your work (typing ; )
just a little bit more pleasing.
]]></description>
      <pubDate>Mon, 25 May 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/install-the-best-coding-font.html</guid>
      <content:encoded><![CDATA[If you are in IT professionally (coding or sysadmin) you will be staring at
[monospaced](https://en.wikipedia.org/wiki/Monospaced-font) fonts for many many hours a day. So it's probably justified to
spend 2 minutes picking a very good one. It can make your work (typing ; )
just a little bit more pleasing.

<!--more-->

I did some research and the [Inconsolata font](https://www.levien.com/type/myfonts/inconsolata.html) by [Raph Levien](https://levien.com/) is
[considered](https://www.actsofvolition.com/archive/2007/september/inconsolata) one of the [best](https://hivelogic.com/articles/view/top-10-programming-fonts) programming [fonts](https://wiki.ubuntu.com/Fonts) by many. I must
say it's pretty good on the eyes, but decide for yourself:

![inconsolata.png](/assets/images/posts/inconsolata.png "inconsolata.png")

You can [download it](https://www.levien.com/type/myfonts/Inconsolata.otf) and install it yourself, or:

## Install the Font on Ubuntu

As suggested by Gekkio in the comments section:

> "If you're using Ubuntu 9.04, there's also a packaged version available in
>   the repos which should setup things perfectly:
> sudo aptitude install ttf-inconsolata
> (Requires universe repos to be enabled)"

Thank you Gekkio!

So just copy-paste that. Your font cache will be refreshed and when you start
a new **Terminal** or **IDE**, you should be able to select the Inconsolata
font.

## NetBeans Anyone?

NetBeans didn't seem to support the Inconsolata font but as suggested by Filip
Juki in the comments section:

> "It seems that NetBeans doesn't support OTF fonts. You might try converting
>   it to TTF using FontForge, I decided it wasn't worth the hassle."

So I decided to do **just that** and I now have Inconsolata in [My new IDE: NetBeans](/blog/2008/12/02/my-new-ide-netbeans/).

### Download the TTF Version

If you need the TTF version for NetBeans (or another IDE that doesn't support
OTF), [download it here](./assets/images/posts/Inconsolata.ttf).

### Install the TTF Version (Ubuntu Only)

If you have Ubuntu you can paste the following in a terminal:

```bash
[ "$(whoami)" = "root" ] && echo "No this time you really can't be root ; )" && exit 1
sudo echo "Installing inconsolata font..."
mkdir -p ~/.fonts/
cd ~/.fonts/
wget https://kvz.io/assets/images/posts/Inconsolata.ttf
sudo echo "Refreshing cache..."
sudo fc-cache -f -v
sudo echo "Done."
```

## Question

**What is your favorite coding / sysadmin font?**
]]></content:encoded>
      <dc:date>2009-05-25T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Fix Flash Problems on Ubuntu</title>
      <link>https://kvz.io/fix-flash-problems-on-ubuntu.html</link>
      <description><![CDATA[I had some difficulties playing Flash videos lately. Problems ranged from
lagging sound, to ugliness, to idling black screens, to strange gray Play
buttons that didn't do anything. The following solved my Flash issues on
Ubuntu.
]]></description>
      <pubDate>Sun, 17 May 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/fix-flash-problems-on-ubuntu.html</guid>
      <content:encoded><![CDATA[I had some difficulties playing Flash videos lately. Problems ranged from
lagging sound, to ugliness, to idling black screens, to strange gray Play
buttons that didn't do anything. The following solved my Flash issues on
Ubuntu.

<!--more-->

A very **short** & simple article this time.

You could take the **graphical** approach (use Synaptic & visit Adobe site),
but I'm using the **command line** because copy-pasting is really fast and concise,
so here we go:

## Cleanup

Open a terminal and type:

```bash
$ sudo echo -n "Cleaning up... "
$ sudo aptitude purge swfdec-mozilla
$ sudo aptitude purge adobe-flashplugin
$ sudo echo "[done]"
```

## Install Adobe Flashplayer 10 on Ubuntu

Open a terminal and type:

```bash
$ sudo echo -n "Downloading... " \
 && pushd /tmp \
 && wget https://fpdownload.macromedia.com/get/flashplayer/current/install_flash_player_10_linux.deb \
 && sudo echo "[done]" \
 && sudo echo -n "Installing... " \
 && sudo aptitude install libcurl3 \
 && sudo dpkg -i ./install_flash_player_10_linux.deb \
 && popd \
 && sudo echo "[done]"
```

## That's it

Restart Firefox and go & enjoy some [YouTube videos](https://www.youtube.com/watch?v=9X2u2cdvJSg#t=0m02s) again :)
]]></content:encoded>
      <dc:date>2009-05-17T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Have Fun With Google Chart</title>
      <link>https://kvz.io/have-fun-with-google-chart.html</link>
      <description><![CDATA[Pictures say more than a thousand words. This is true for your data as well.
With Google Chart you can now easily generate charts of your data. No
expertise required. Just make sure you format your data correctly, add it to
the Google Chart URL, and it will return a nice graph.
]]></description>
      <pubDate>Wed, 01 Apr 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/have-fun-with-google-chart.html</guid>
      <content:encoded><![CDATA[Pictures say more than a thousand words. This is true for your data as well.
With Google Chart you can now easily generate charts of your data. No
expertise required. Just make sure you format your data correctly, add it to
the Google Chart URL, and it will return a nice graph.

<!--more-->

It was very simple to create the following graph:

![...man|Not+resembling+Pac-man&chp=0.628](https://chart.apis.google.com/chart?cht=pc&chtt=Percentage+of+Chart+Which+Resembles+Pac-man&chco=FDFF00|49005F&chs=500x200&chd=t:78,22&chl=Resembling+Pac-man|Not+resembling+Pac-man&chp=0.628 "...man|Not+resembling+Pac-man&chp=0.628")

.. with the following URL:

```bash
https://chart.apis.google.com/chart?cht=pc
&chtt=Percentage+of+Chart+Which+Resembles+Pac-man
&chco=FDFF00|49005F
&chs=500x200
&chd=t:78,22
&chl=Resembling+Pac-man|Not+resembling+Pac-man
&chp=0.628
```
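
If you would rather not glue such a URL together by hand, here is a small Python sketch of the same request. The `chart_url` helper is a name made up for this example, and it only builds the string, it does not contact Google. The parameters are the ones from the URL above:

```python
from urllib.parse import urlencode

BASE_URL = "https://chart.apis.google.com/chart"


def chart_url(params):
    """Build a chart URL, keeping ':', ',' and '|' readable in the query."""
    return BASE_URL + "?" + urlencode(params, safe=":,|")


pacman = {
    "cht": "pc",                                            # chart type
    "chtt": "Percentage of Chart Which Resembles Pac-man",  # title
    "chco": "FDFF00|49005F",                                # colors
    "chs": "500x200",                                       # size
    "chd": "t:78,22",                                       # data
    "chl": "Resembling Pac-man|Not resembling Pac-man",     # labels
    "chp": "0.628",                                         # rotation
}

url = chart_url(pacman)
print(url)
```

Spaces are turned into `+` for you, so the titles and labels can be written as plain text.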

I'm not about to write full documentation on how to do this, because [Google
already did that](https://code.google.com/apis/chart/types.html). Besides, I have better things to do like using their API
to show [nice benchmarking](/blog/2009/03/31/improve-mysql-insert-performance/) [graphs](https://phpjs.org/statistics/index).

Enjoy!
]]></content:encoded>
      <dc:date>2009-04-01T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Improve MySQL Insert Performance</title>
      <link>https://kvz.io/improve-mysql-insert-performance.html</link>
      <description><![CDATA[Sometimes MySQL needs to work hard. I've been working on an import script that
fires a lot of INSERTs. Normally our database server handles 1,000 inserts /
sec. That wasn't enough. So I went looking for methods to improve the speed of
MySQL inserts and was finally able to increase this number to 28,000 inserts
per second. Check out my late night benchmarking adventures.
]]></description>
      <pubDate>Tue, 31 Mar 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/improve-mysql-insert-performance.html</guid>
      <content:encoded><![CDATA[Sometimes MySQL needs to work hard. I've been working on an import script that
fires a lot of INSERTs. Normally our database server handles 1,000 inserts /
sec. That wasn't enough. So I went looking for methods to improve the speed of
MySQL inserts and was finally able to increase this number to 28,000 inserts
per second. Check out my late night benchmarking adventures.

<!--more-->

I'm going to show you the result of 3 approaches that I tried to boost the
speed of 'bulk' queries:

- Delayed Insert
- Transaction
- Load Data

This article focusses on the InnoDB storage engine.

## Delayed Insert

MySQL has an [INSERT DELAYED](https://www.chapter31.com/2008/05/22/insert-delayed-with-mysql/) feature. Despite the name this is actually
meant to speed up your queries ; ) And from what I understand it does a very
good job.

**Unfortunately** it only works with **MyISAM**, MEMORY, ARCHIVE, and
BLACKHOLE tables.

That rules out my favorite storage engine of the moment: **InnoDB.**

So where to turn?

## Transaction

A Transaction basically combines multiple queries in 1 'package'. If 1 query
in this package fails, you can 'cancel' all the other queries within that
package as well.

That provides additional **integrity** to your relational data. Say record A
depends on record B: if B **could** be deleted but A could **not**, you are
left with a broken dependency in your database. That corruption could have
easily been avoided using a Transaction.

Let me show you **how easy** a transaction really is in basic **PHP/SQL**
terms:

```php
<?php
mysql_query("START TRANSACTION");
mysql_query("INSERT INTO `log` (`level`, `msg`) VALUES ('err', 'foobar!')");
mysql_query("INSERT INTO `log` (`level`, `msg`) VALUES ('err', 'foobar!')");
mysql_query("INSERT INTO `log` (`level`, `msg`) VALUES ('err', 'foobar!')");
mysql_query("COMMIT"); // Or "ROLLBACK" if you changed your mind
?>
```
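
The same all-or-nothing behaviour can be tried without a MySQL server. Here is a minimal Python sketch, assuming SQLite (via the standard `sqlite3` module) as a stand-in for MySQL. The `log` table mirrors the one above, and `insert_bulk` is a name made up for this example:

```python
import sqlite3


def insert_bulk(conn, rows):
    """Insert all rows in one transaction; return True if it was committed."""
    try:
        # The connection as context manager commits on success
        # and rolls back if an exception escapes the block
        with conn:
            conn.executemany(
                "INSERT INTO log (level, msg) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (level TEXT NOT NULL, msg TEXT NOT NULL)")

# Both rows are valid, so the whole package is committed
ok = insert_bulk(conn, [("err", "foobar!"), ("err", "foobar!")])

# The second row violates NOT NULL, so the first one is cancelled too
failed = insert_bulk(conn, [("err", "first"), ("err", None)])

count = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(ok, failed, count)  # True False 2
```

Only the two rows from the successful bulk end up in the table. The valid row from the failed bulk is rolled back along with the broken one.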

OK moving on :)

### Transaction Performance - the Theory

I showed you the integrity gain. That's reason enough to 'go Transactional'
right now. But as an added bonus, Transactions could also be used for
performance gain. **How?**

- Normally your database table gets **re-indexed after every insert**.
  That's some heavy lifting for your database.

But when your queries are wrapped inside a Transaction, the table does not get
**re-indexed until after** this entire bulk is processed, saving a lot of
work.

> Bulk processing will be the key to performance gain.

### Bench Results

So far the theory. Now let's benchmark this. What does it gain us in **queries
per second (qps)** terms:

<img src="https://chart.apis.google.com/chart?chd=t%3A20.75%2C31.85%2C38.45%2C46.65%2C53.6%2C62.6%2C64.65%2C65.45%2C66.4%2C65.5|12%2C21%2C26.7%2C34.9%2C45.35%2C60.95%2C64.9%2C60.35%2C65.9%2C65.55|23.65%2C32.25%2C38.8%2C44.35%2C51.95%2C64.75%2C65.9%2C67.65%2C66.6%2C64.6|58.5%2C57.7%2C65%2C62.15%2C65.9%2C67.65%2C68.3%2C67.9%2C64.35%2C66.25&chdl=transaction|transaction_lock|transaction_nokeys|none&chxl=0%3A|1|2|3|5|10|50|100|1000|10k|25k|1%3A|||400||800||1200||1600||2000|2%3A||||||inserts||||||3%3A||||||qps||||||&chtt=Insert+rates+using+4+different+bulk+methods&chxl2=inserts&chxl3=qps&cht=lc&chco=22FF22%2C0022FF%2CFF0000%2C00AAAA%2CFF00FF%2CFFA500%2CCC0000%2C0000CC%2C0080C0%2C8080C0%2CFF0080%2C800080%2C688E23%2C408080%2C808000%2C000000&chs=570x250&chxt=x%2Cy%2Ct%2Cr" />

As you can see:

- I was not able to put this theory into practice and get good results.
- There is some overhead in the Transaction which actually causes
  performance to **drop** for bulks with fewer than 50 queries.

I tried some other forms of transaction (shown in a graph below) but none of
them really hit the jackpot.

OK so Transactions are good to protect your data, and in **theory** can yield a
performance gain, but I was unable to produce that.

Clearly this wasn't the performance boost I was hoping for.

**Moving on.**

## Load Data - the Mother Load

MySQL has a very powerful way of processing bulks of data called [LOAD DATA
INFILE](https://dev.mysql.com/doc/refman/5.1/en/load-data.html). The LOAD DATA INFILE statement reads rows from a text file into a
table at a very high speed.
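
As a rough sketch of how such a text file could be produced from Python: the helper names, the `log` table and the temporary path are assumptions for illustration, and the values are assumed to contain no tabs, newlines, or backslashes (LOAD DATA's default format treats those specially). The statement is only built as a string here, not sent to a server:

```python
import os
import tempfile


def write_bulk_file(rows, path):
    """Write one tab-separated line per row, LOAD DATA's default layout."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for row in rows:
            handle.write("\t".join(row) + "\n")


def load_data_statement(path, table, columns):
    """Build the LOAD DATA INFILE statement for the given file and columns."""
    column_list = ", ".join("`%s`" % name for name in columns)
    return "LOAD DATA INFILE '%s' INTO TABLE `%s` (%s)" % (path, table, column_list)


rows = [("err", "foobar!"), ("err", "foobar!")]
path = os.path.join(tempfile.mkdtemp(), "log_rows.tsv")

write_bulk_file(rows, path)
statement = load_data_statement(path, "log", ["level", "msg"])
print(statement)
```

By default LOAD DATA INFILE reads the file on the database server, so the server needs access to that path. `LOAD DATA LOCAL INFILE` reads it from the client instead.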

### Bench Results

In the following graph I tried to insert different-sized bulks
using different methods. I recorded how much time each run took and calculated
the rate from that: I take the total time necessary for the entire operation,
and divide that by the number of queries. So what you see is really what you
get.
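
A rough sketch of that measurement, assuming nothing more than wall-clock timing with `microtime()`. The helper `measureQps` is hypothetical, and the `usleep()` call merely stands in for a real bulk of queries.

```php
<?php
/**
 * Hypothetical helper: runs $work once and returns how many queries
 * per second it managed, based on the total wall-clock time.
 */
function measureQps($work, $queryCount) {
    $start = microtime(true);
    call_user_func($work);
    $duration = microtime(true) - $start;

    // Guard against a zero duration on very fast runs
    $duration = max($duration, 0.000001);

    return round($queryCount / $duration, 2);
}

// Stand-in workload; a real benchmark would execute the bulk here
$qps = measureQps(function () {
    usleep(10000); // pretend the bulk took about 10 ms
}, 100);

echo $qps . " queries per second\n";
```

Timing the whole operation, rather than individual queries, means connection and buffering overhead is included in the number.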

OK enough with these [so-called facts](https://www.youtube.com/watch?v=jOjfxEejS2Y#t=1m57s) ; ) Back to the excitement :D

At 10,000 records I was able to get a performance gain of **2,124.09%**

<img src ="https://chart.apis.google.com/chart?chd=t%3A1.85%2C3.41%2C4.66%2C7.41%2C13.43%2C34.35%2C37.41%2C85.41%2C93.69%2C89.92|0.92%2C1.55%2C2.49%2C4.22%2C7.71%2C26.94%2C36.6%2C81.96%2C89.22%2C87.33|1.19%2C1.91%2C2.45%2C3.12%2C3.61%2C4.73%2C8.21%2C6.71%2C7.12%2C8.46|1.5%2C2.11%2C2.64%2C2.95%2C1.74%2C4.2%2C4.2%2C4.07%2C4.29%2C4.3|0.76%2C1.25%2C1.8%2C2.25%2C3.05%2C4.05%2C4.11%2C4.44%2C4.34%2C4.32|1.44%2C2.22%2C2.68%2C3.04%2C3.56%2C4.25%2C4.36%2C4.04%2C4.33%2C4.31|3.54%2C3.92%2C4%2C4.23%2C4.29%2C4.52%2C4.42%2C4.21%2C4.39%2C4.23&chdl=loaddata|loaddata_unsafe|loadsql_unsafe|transaction|transaction_lock|transaction_nokeys|none&chxl=0%3A|1|2|3|5|10|50|100|1000|10k|25k|1%3A|||6000||12000||18000||24000||30000|2%3A||||||inserts||||||3%3A||||||qps||||||&chtt=Insert+rates+using+7+different+bulk+methods&chxl2=inserts&chxl3=qps&cht=lc&chco=22FF22%2C0022FF%2CFF0000%2C00AAAA%2CFF00FF%2CFFA500%2CCC0000%2C0000CC%2C0080C0%2C8080C0%2CFF0080%2C800080%2C688E23%2C408080%2C808000%2C000000&chs=570x250&chxt=x%2Cy%2Ct%2Cr" />

As you can see

- Where the Transaction method had a maximum throughput of 1,588 inserts per
  second, Load Data allowed MySQL to process a staggering 28,108
  inserts per second.
- There is no significant overhead in Load Data, e.g. you can use this with
  2 queries per bulk and still have a performance increase of 153%.
- There is a **saturation** point around bulks of 10,000 inserts. After this
  point the queries per second rate (qps) didn't show an increase anymore.
- My advice would be to start a new bulk every **1,000 inserts**. It's what
  I consider to be the sweet spot because it keeps buffers small and you will
  still benefit from a performance gain of **2027.13%**.

The next step will make your buffer **1000%** bigger and it will only give you
an additional performance gain of **4%**.
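
Cutting a large data set into bulks of 1,000 could look like the sketch below. It only uses PHP's built-in `array_chunk()`; the rows are fake and the `log`-style columns are an assumption.

```php
<?php
// 2,500 fake rows standing in for real data
$rows = array();
for ($i = 0; $i < 2500; $i++) {
    $rows[] = array('level' => 'err', 'msg' => 'message ' . $i);
}

// Split into bulks of at most 1,000 rows each
$bulks = array_chunk($rows, 1000);

foreach ($bulks as $n => $bulk) {
    // This is where each bulk would be handed to the bulk insert routine
    echo 'bulk ' . $n . ': ' . count($bulk) . " rows\n";
}
```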

So if you have a heavy-duty MySQL job that currently takes 1 hour to run, this
approach could make it run within 3 minutes! Enjoy the remaining 57 minutes of
your hour! :D
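
The arithmetic behind that estimate, as a sketch. It assumes the whole job is bound by inserts and that the 2027.13% gain applies to all of it, which a real job may not live up to. `estimateRuntime` is a made-up helper.

```php
<?php
/**
 * Hypothetical helper: estimated runtime after a gain of $gainPercent.
 * A gain of 100% means twice the throughput, so half the runtime.
 */
function estimateRuntime($minutes, $gainPercent) {
    $speedup = 1 + $gainPercent / 100;

    return round($minutes / $speedup, 2);
}

echo estimateRuntime(60, 2027.13) . " minutes\n"; // 2.82 minutes
```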

### Load Data Quirks

Of course there is a price to pay for this performance win. Before the data is
loaded, the data file must be:

- Saved on disk (or in RAM, see my other article [Create turbocharged storage using tmpfs](/blog/2007/07/18/create-turbocharged-storage-using-tmpfs/))
- In comma-separated values (CSV) format.
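
Done by hand, those two steps might look roughly like this. It is only a sketch: the temp directory, the `log` table and its columns are assumptions, and MySQL must be allowed to read the file you point it at.

```php
<?php
$rows = array(
    array('level' => 'err',  'msg' => 'foobar!'),
    array('level' => 'warn', 'msg' => 'disk almost full'),
);

// 1. Save the rows on disk in CSV format
$file   = tempnam(sys_get_temp_dir(), 'bulk');
$handle = fopen($file, 'w');
foreach ($rows as $row) {
    fputcsv($handle, array_values($row), ',', '"', '\\');
}
fclose($handle);

// 2. Build the statement that tells MySQL how to read that file
$fields = '`' . implode('`, `', array_keys($rows[0])) . '`';
$sql    = "LOAD DATA INFILE '" . addslashes($file) . "'\n"
        . "INTO TABLE `log`\n"
        . "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'\n"
        . "LINES TERMINATED BY '\\n'\n"
        . "(" . $fields . ")";

echo $sql . "\n";
```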

This is probably **not something** you want to be bothered with. So why not
create a **PHP function** that handles these quirks for us?

## Wrapping This Up in a PHP Function

Let's save this logic inside a function so we can easily reuse it to our
benefit.

We'll name this function `mysqlBulk` and use it like this:

- Collect our queries or data in an array (the bulk).
- Feed that array along with the table name to the `mysqlBulk` function.
- Have it return the qps for easy benchmarking. Or false on failure.

Source *(still working on this, will be updated regularly)*:

```php
<?php
/**
 * Executes multiple queries in a 'bulk' to achieve better
 * performance and integrity.
 *
 * @param array  $data   An array of queries. Except for loaddata methods. Those require a 2 dimensional array.
 * @param string $table
 * @param string $method
 * @param array  $options
 *
 * @return float|false Queries per second, or false on failure
 */
function mysqlBulk(&$data, $table, $method = 'transaction', $options = array()) {
  // Default options
  if (!isset($options['query_handler'])) {
      $options['query_handler'] = 'mysql_query';
  }
  if (!isset($options['trigger_errors'])) {
      $options['trigger_errors'] = true;
  }
  if (!isset($options['trigger_notices'])) {
      $options['trigger_notices'] = true;
  }
  if (!isset($options['eat_away'])) {
      $options['eat_away'] = false;
  }
  if (!isset($options['in_file'])) {
      // AppArmor may prevent MySQL to read this file.
      // Remember to check /etc/apparmor.d/usr.sbin.mysqld
      $options['in_file'] = '/dev/shm/infile.txt';
  }
  if (!isset($options['link_identifier'])) {
      $options['link_identifier'] = null;
  }

  // Make options local
  extract($options);

  // Validation
  if (!is_array($data)) {
      if ($trigger_errors) {
          trigger_error('First argument "queries" must be an array',
              E_USER_ERROR);
      }
      return false;
  }
  if (empty($table)) {
      if ($trigger_errors) {
          trigger_error('No insert table specified',
              E_USER_ERROR);
      }
      return false;
  }
  if (count($data) > 10000) {
      if ($trigger_notices) {
          trigger_error('It\'s recommended to use <= 10000 queries/bulk',
              E_USER_NOTICE);
      }
  }
  if (empty($data)) {
      return 0;
  }

  if (!function_exists('__exe')) {
      function __exe ($sql, $query_handler, $trigger_errors, $link_identifier = null) {
          if ($link_identifier === null) {
              $x = call_user_func($query_handler, $sql);
          } else {
              $x = call_user_func($query_handler, $sql, $link_identifier);
          }
          if (!$x) {
              if ($trigger_errors) {
                  trigger_error(sprintf(
                      'Query failed. %s [sql: %s]',
                      mysql_error($link_identifier),
                      $sql
                  ), E_USER_ERROR);
              }
              // Report the failure even when errors are silenced
              return false;
          }

          return true;
      }
  }

  if (!function_exists('__sql2array')) {
      function __sql2array($sql, $trigger_errors) {
          if (substr(strtoupper(trim($sql)), 0, 6) !== 'INSERT') {
              if ($trigger_errors) {
                  trigger_error('Magic sql2array conversion '.
                      'only works for inserts',
                      E_USER_ERROR);
              }
              return false;
          }

          $parts   = preg_split("/[,\(\)] ?(?=([^'|^\\\']*['|\\\']" .
                                "[^'|^\\\']*['|\\\'])*[^'|^\\\']" .
                                "*[^'|^\\\']$)/", $sql);
          $process = 'keys';
          $data    = array();

          foreach ($parts as $k=>$part) {
              $tpart = strtoupper(trim($part));
              if (substr($tpart, 0, 6) === 'INSERT') {
                  continue;
              } else if (substr($tpart, 0, 6) === 'VALUES') {
                  $process = 'values';
                  continue;
              } else if (substr($tpart, 0, 1) === ';') {
                  continue;
              }

              if (!isset($data[$process])) $data[$process] = array();
              $data[$process][] = $part;
          }

          return array_combine($data['keys'], $data['values']);
      }
  }

  // Start timer
  $start = microtime(true);
  $count = count($data);

  // Choose bulk method
  switch ($method) {
      case 'loaddata':
      case 'loaddata_unsafe':
      case 'loadsql_unsafe':
          // Inserts data only
          // Use array instead of queries

          $buf    = '';
          foreach($data as $i=>$row) {
              if ($method === 'loadsql_unsafe') {
                  $row = __sql2array($row, $trigger_errors);
              }
              $buf .= implode(':::,', $row)."^^^\n";
          }

          $fields = implode(', ', array_keys($row));

          if (!@file_put_contents($in_file, $buf)) {
              $trigger_errors && trigger_error('Cant write to buffer file: "'.$in_file.'"', E_USER_ERROR);
              return false;
          }

          if ($method === 'loaddata_unsafe') {
              if (!__exe("SET UNIQUE_CHECKS=0", $query_handler, $trigger_errors, $link_identifier)) return false;
              if (!__exe("set foreign_key_checks=0", $query_handler, $trigger_errors, $link_identifier)) return false;
              // Only works for SUPER users:
              #if (!__exe("set sql_log_bin=0", $query_handler, $trigger_errors, $link_identifier)) return false;
          }

          if (!__exe("
             LOAD DATA INFILE '${in_file}'
             INTO TABLE ${table}
             FIELDS TERMINATED BY ':::,'
             LINES TERMINATED BY '^^^\\n'
             (${fields})
         ", $query_handler, $trigger_errors, $link_identifier)) return false;

          break;
      case 'transaction':
      case 'transaction_lock':
      case 'transaction_nokeys':
          // Max 26% gain, but good for data integrity
          if ($method == 'transaction_lock') {
              if (!__exe('SET autocommit = 0', $query_handler, $trigger_errors, $link_identifier)) return false;
              if (!__exe('LOCK TABLES '.$table.' READ', $query_handler, $trigger_errors, $link_identifier)) return false;
          } else if ($method == 'transaction_nokeys') {
              if (!__exe('ALTER TABLE '.$table.' DISABLE KEYS', $query_handler, $trigger_errors, $link_identifier)) return false;
          }

          if (!__exe('START TRANSACTION', $query_handler, $trigger_errors, $link_identifier)) return false;

          foreach ($data as $query) {
              if (!__exe($query, $query_handler, $trigger_errors, $link_identifier)) {
                  __exe('ROLLBACK', $query_handler, $trigger_errors, $link_identifier);
                  if ($method == 'transaction_lock') {
                      __exe('UNLOCK TABLES', $query_handler, $trigger_errors, $link_identifier);
                  }
                  return false;
              }
          }

          __exe('COMMIT', $query_handler, $trigger_errors, $link_identifier);

          if ($method == 'transaction_lock') {
              if (!__exe('UNLOCK TABLES', $query_handler, $trigger_errors, $link_identifier)) return false;
          } else if ($method == 'transaction_nokeys') {
              if (!__exe('ALTER TABLE '.$table.' ENABLE KEYS', $query_handler, $trigger_errors, $link_identifier)) return false;
          }
          break;
      case 'none':
          foreach ($data as $query) {
              if (!__exe($query, $query_handler, $trigger_errors, $link_identifier)) return false;
          }

          break;
      case 'delayed':
          // MyISAM, MEMORY, ARCHIVE, and BLACKHOLE tables only!
          if ($trigger_errors) {
              trigger_error('Not yet implemented: "'.$method.'"',
                  E_USER_ERROR);
          }
          break;
      case 'concatenation':
      case 'concat_trans':
          // Deprecated bulk method
          if ($trigger_errors) {
              trigger_error('Deprecated bulk method: "'.$method.'"',
                  E_USER_ERROR);
          }
          return false;
          break;
      default:
          // Unknown bulk method
          if ($trigger_errors) {
              trigger_error('Unknown bulk method: "'.$method.'"',
                  E_USER_ERROR);
          }
          return false;
          break;
  }

  // Stop timer
  $duration = microtime(true) - $start;
  $qps      = round ($count / $duration, 2);

  if ($eat_away) {
      $data = array();
  }

  @unlink($options['in_file']);

  // Return queries per second
  return $qps;
}
?>
```
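
To make the `loaddata` branch a bit more tangible, this is what its buffer looks like for two rows. The snippet only repeats the `implode()` line from the function above; the sample rows are made up.

```php
<?php
$data = array(
    array('level' => 'err',  'msg' => 'foobar!'),
    array('level' => 'warn', 'msg' => 'a, b and c'),
);

$buf = '';
foreach ($data as $row) {
    // Same separators as in mysqlBulk: ':::,' between fields, '^^^' plus newline after each row
    $buf .= implode(':::,', $row) . "^^^\n";
}

echo $buf;
// err:::,foobar!^^^
// warn:::,a, b and c^^^
```

The odd-looking separators presumably exist so that plain commas inside a value, as in the second row, are not mistaken for field boundaries.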

## Using the Function

The `mysqlBulk` function supports a couple of methods.

### Array Input With Method: Loaddata (Preferred)

What would really give it wings, is if you can supply the data as an array.
That way I won't have to translate your raw queries to arrays, before I can
convert them back to CSV format. Obviously skipping all that conversion saves
a lot of time.

```php
<?php
$data   = array();
$data[] = array('level' => 'err', 'msg' => 'foobar!');
$data[] = array('level' => 'err', 'msg' => 'foobar!');
$data[] = array('level' => 'err', 'msg' => 'foobar!');

if (false === ($qps = mysqlBulk($data, 'log', 'loaddata', array(
    'query_handler' => 'mysql_query'
)))) {
    trigger_error('mysqlBulk failed!', E_USER_ERROR);
} else {
    echo 'All went well @ '.$qps. ' queries per second'."\n";
}
?>
```

Most of the time it's even easier because you don't have to write queries.

### SQL Input With Method: loadsql\_unsafe

If you can really only deliver raw insert queries, use the loadsql\_unsafe
method. It's unsafe because I convert your queries to arrays on the fly. That
also makes it 10 times slower (still twice as fast as other methods).

This is what the basic flow could look like:

```php
<?php
$queries   = array();
$queries[] = "INSERT INTO `log` (`level`, `msg`) VALUES ('err', 'foobar!')";
$queries[] = "INSERT INTO `log` (`level`, `msg`) VALUES ('err', 'foobar!')";
$queries[] = "INSERT INTO `log` (`level`, `msg`) VALUES ('err', 'foobar!')";

if (false === ($qps = mysqlBulk($queries, 'log', 'loadsql_unsafe', array(
    'query_handler' => 'mysql_query'
)))) {
    trigger_error('mysqlBulk failed!', E_USER_ERROR);
} else {
    echo 'All went well @ '.$qps. ' queries per second'."\n";
}
?>
```

### Safe SQL Input With Method: Transaction

Want to do a Transaction?

```php
<?php
mysqlBulk($queries, 'log', 'transaction');
?>
```

### Options

Change the `query_handler` from `mysql_query` to your actual query function.
If you have a DB class with an `execute()` method, you will have to
pass the object and the method name together in an array like this:

```php
<?php
$db = new DBClass();
mysqlBulk($queries, 'log', 'none', array(
    'query_handler' => array($db, 'execute')
));
// Now your $db->execute() function will actually
// be used to make the real MySQL calls
?>
```
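
If the array notation looks unfamiliar: it is PHP's regular callable syntax, the same thing `call_user_func()` accepts. A small sketch, with `FakeDb` as a made-up stand-in for your own class:

```php
<?php
// Hypothetical stand-in for your own database class
class FakeDb {
    public $log = array();

    public function execute($sql) {
        // A real class would send $sql to MySQL here
        $this->log[] = $sql;
        return true;
    }
}

$db      = new FakeDb();
$handler = array($db, 'execute');

// This is essentially how the handler gets invoked
$result = call_user_func($handler, 'SELECT 1');

var_dump($result);         // bool(true)
var_dump(count($db->log)); // int(1)
```

Whatever your method returns is treated as success or failure, so it should return something falsy when a query fails.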

Don't want mysqlBulk to produce any errors? Use the `trigger_errors` option.

```php
<?php
mysqlBulk($queries, 'log', 'transaction', array(
    'trigger_errors' => false
));
?>
```

Want mysqlBulk to produce notices? Use the `trigger_notices` option.

```php
<?php
mysqlBulk($queries, 'log', 'transaction', array(
    'trigger_notices' => true
));
?>
```

Have ideas on this? Leave me a comment.

## Benchmark Details - What Did I Use?

Of course solid benching is very hard to do and I [already failed once](/blog/2009/03/31/boost-mysql-performance-by-1200/).
This is what I used.

### Table Structure

I created a small table with some indexes & varchars. Here's the structure
dump:

```sql
--
-- Table structure for table `benchmark_data`
--

CREATE TABLE `benchmark_data` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `user_id` smallint(5) unsigned NOT NULL,
  `a` varchar(20) NOT NULL,
  `b` varchar(30) NOT NULL,
  `c` varchar(40) NOT NULL,
  `d` varchar(255) NOT NULL,
  `e` varchar(254) NOT NULL,
  `created` timestamp NOT NULL default CURRENT_TIMESTAMP,
  PRIMARY KEY  (`id`),
  KEY `a` (`a`,`b`),
  KEY `user_id` (`user_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1;
```

### Table Data

I filled the table with ~2,846,799 records containing random numbers & strings of variable length. No 1000 records are the same.
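
The original fill script isn't shown here, so the following is only a sketch of how such rows could be generated. The helper names are made up; the column lengths follow the table structure above.

```php
<?php
/**
 * Hypothetical generator: a random string of 1 to $maxLength characters.
 */
function randomString($maxLength) {
    $chars  = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $length = mt_rand(1, $maxLength);
    $out    = '';
    for ($i = 0; $i < $length; $i++) {
        $out .= $chars[mt_rand(0, strlen($chars) - 1)];
    }

    return $out;
}

/**
 * One row for `benchmark_data`; `id` and `created` are left to MySQL.
 */
function randomBenchmarkRow() {
    return array(
        'user_id' => mt_rand(0, 65535), // fits smallint unsigned
        'a'       => randomString(20),
        'b'       => randomString(30),
        'c'       => randomString(40),
        'd'       => randomString(255),
        'e'       => randomString(254),
    );
}

$row = randomBenchmarkRow();
echo implode(', ', array_keys($row)) . "\n"; // user_id, a, b, c, d, e
```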

### Machine

I had the following configuration to benchmark with:

```bash
Product Name: PowerEdge 1950
Disks: 4x146GB @ 15k rpm in RAID 1+0
Memory Netto Size: 4 GB
CPU Model: Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
Operating System: Ubuntu 8.04 hardy (x86_64)
MySQL: 5.0.51a-3ubuntu5.4
PHP: 5.2.4-2ubuntu5.5
```

- *Provided by [True.nl](https://www.true.nl/)*

Thanks to Erwin Bleeker for pointing out that my [initial benchmark was shit](/blog/2009/03/31/boost-mysql-performance-by-1200/).

### Finally

This is my second benchmark so if you have some pointers that could improve my next: I'm listening.
]]></content:encoded>
      <dc:date>2009-03-31T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Boost MySQL Performance by 1200%</title>
      <link>https://kvz.io/boost-mysql-performance-by-1200.html</link>
      <description><![CDATA[Sorry folks, this article was based on flawed benchmark results, I will soon
post an update!
]]></description>
      <pubDate>Tue, 31 Mar 2009 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/boost-mysql-performance-by-1200.html</guid>
      <content:encoded><![CDATA[Sorry folks, this article was based on flawed benchmark results, I will soon
post [an update](/blog/2009/03/31/improve-mysql-insert-performance/)!

<!--more-->

Please go to:

- [Improve MySQL Insert Performance](/blog/2009/03/31/improve-mysql-insert-performance/)
]]></content:encoded>
      <dc:date>2009-03-31T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>The Pragmatic SQL Style Guide</title>
      <link>https://kvz.io/sql-formatting.html</link>
      <description><![CDATA[Code spends more time being read than being written. I think naturally this is true for queries as well. So it might help if we teach ourselves some
guidelines on how to format them nicely.
]]></description>
      <pubDate>Wed, 04 Mar 2009 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/sql-formatting.html</guid>
      <content:encoded><![CDATA[Code spends more time being [read than being written](https://www.codinghorror.com/blog/archives/000684.html). I think naturally this is true for queries as well. So it might help if we teach ourselves some
guidelines on how to format them nicely.

I've searched but at the time of writing, could not find a public Style Guide for SQL formatting. I'll try to keep this guide short and pragmatic, so that it has a chance of actually being read & stuck to : )

<!--more-->

This targets MySQL but should work as well for other dialects/engines.

We have a couple of tools at our disposal inside SQL to format our queries.

- **SQL whitespace** is ignored by MySQL's parser, so let's use it to make
  our lives easier, right?
- **Dummy conditions** have no effect in SQL statements. Things like: `WHERE 1`.

Ok let's just begin with some of the SQL layout habits we settled on at our company.

## All Clauses Get a Newline

This will provide a clear separation of the different parts of the query, making
it easier to read & comprehend. And it will enable you to better maintain
the query because you can jump between different clauses by just pressing up &
down.

### Before

```sql
SELECT * FROM `books` ORDER BY `title`
```

### After

```sql
SELECT *
FROM `books`
ORDER BY `title`
```

N.b.: Whitespace is ignored by MySQL's parser, but
[not by MySQL's query cache on versions &lt;5](https://www.mysqlperformanceblog.com/2008/03/20/mysql-query-cache-whitespace-and-comments/),
thanks to 'foobar' for pointing that out. So if you want the query cache to pay off, pick one format & stick with it ; )

## Accompanied Fieldnames Get Their Own Newline

This will allow you to very easily (or even dynamically) add or remove certain
fields from the query.

### Before

```sql
SELECT `id`, `title`, `rating`
FROM `books`
ORDER BY `title`
```

### After

```sql
SELECT
  `id`,
  `title`,
  `rating`
FROM `books`
ORDER BY `title`
```
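
When a query is generated from code, this layout falls out almost for free. A sketch with a made-up `buildSelect` helper; it assumes table and field names come from your own code, not from user input.

```php
<?php
/**
 * Hypothetical helper: builds a SELECT with one field per line.
 */
function buildSelect($table, array $fields) {
    $quoted = array();
    foreach ($fields as $field) {
        $quoted[] = '  `' . $field . '`';
    }

    return "SELECT\n"
        . implode(",\n", $quoted) . "\n"
        . 'FROM `' . $table . '`';
}

$fields   = array('id', 'title');
$fields[] = 'rating'; // adding a field is a one-line change

echo buildSelect('books', $fields) . "\n";
```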

## Lonely Fieldnames Stay on the Same Line

This will keep the query more or less compact and prevent even the simplest
query from taking up 10 lines.

### Before

```sql
SELECT
  `id`,
  `title`
FROM
  `books`
ORDER BY
  `title`
```

### After

```sql
SELECT
  `id`,
  `title`
FROM `books`
ORDER BY `title`
```

## Where 1

So now that we've played with whitespace a bit, there are other things that
have no effect in SQL as well. Like `1`. This short expression can really help
us define conditions in a more uniform way.

Let's look at an example.

### Before

```sql
SELECT `id`
FROM `books`
WHERE
  `published` = 'yes'
  AND `rating` > 5
```

Hm, too bad, `published = 'yes'` is formatted differently from `rating > 5`
because it doesn't have the `AND` word. We'd have to account for that every
time we change the conditions, or if we were generating this query
automatically: 'Always prefix with AND... **UNLESS** it's the first
condition'. Bad for layout. Bad for automation.

### After

```sql
SELECT `id`
FROM `books`
WHERE 1
  AND `published` = 'yes'
  AND `rating` > 5
```

One other big advantage of writing your conditions in such a uniform format
is that it becomes really easy to conditionally add, or
**temporarily turn off**, conditions with MySQL comments:

```sql
SELECT `id`
FROM `books`
WHERE 1
  -- AND `published` = 'yes'
  AND `rating` > 5
```

..without breaking SQL syntax. Remember that if I had done this with the
**before** query, I would have gotten a syntax error.
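
The automation argument can be sketched in a few lines of PHP. `buildWhere` is a hypothetical helper, and the conditions are assumed to be trusted fragments written by you, not user input.

```php
<?php
/**
 * Hypothetical helper: every condition gets the same "AND" prefix,
 * so there is no special case for the first one.
 */
function buildWhere(array $conditions) {
    $sql = 'WHERE 1';
    foreach ($conditions as $condition) {
        $sql .= "\n  AND " . $condition;
    }

    return $sql;
}

$conditions    = array("`rating` > 5");
$onlyPublished = true;
if ($onlyPublished) {
    $conditions[] = "`published` = 'yes'";
}

echo buildWhere($conditions) . "\n";
```

Even with an empty list of conditions the result is still valid SQL, which is exactly the point of the dummy.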

### The Performance Hit of 1

While this dummy syntax provides developers with some great comfort, of course
we have to **make sure** this addition doesn't come at a price. MySQL guru
[Erwin Bleeker](https://www.google.com/reader/shared/02111691477068722848)
benchmarked on multiple occasions with 1 billion queries (no
cache), to find that the results only differed by one hundredth of a second on
average.

This is how he benchmarked:

```bash
mysql> select benchmark(1000000000, (select SQL_NO_CACHE 1 from employees WHERE 1 limit 1));
+-------------------------------------------------------------------------------+
| benchmark(1000000000, (select SQL_NO_CACHE 1 from employees WHERE 1 limit 1)) |
+-------------------------------------------------------------------------------+
| 0                                                                             |
+-------------------------------------------------------------------------------+
1 row in set (24.84 sec)

mysql> select benchmark(1000000000, (select SQL_NO_CACHE 1 from employees limit 1));
+-----------------------------------------------------------------------+
| benchmark(1000000000, (select SQL_NO_CACHE 1 from employees limit 1)) |
+-----------------------------------------------------------------------+
| 0                                                                     |
+-----------------------------------------------------------------------+
1 row in set (24.83 sec)
```

## Where 0

As you may have guessed, our dummy condition also works for `OR` queries. Just
negate the dummy condition: `0`, look:

```sql
SELECT `id`
FROM `books`
WHERE 0
  OR `published` = 'yes'
  OR `rating` > 5
```

## Joins

Let me just show you how I go about this:

```sql
SELECT
  `authors`.`id`,
  `authors`.`name`,
  `authors`.`birthday`,
  COUNT(`books`.`id`) AS book_pub_cnt
FROM `authors`
LEFT JOIN `books` ON (1
  AND `books`.`author_id` = `authors`.`id`
  AND `books`.`published` = 'yes'
)
WHERE 1
  AND `authors`.`alive` = 'yes'
GROUP BY `authors`.`id`
ORDER BY `authors`.`name`
```

## Furthermore

You may have noticed that I:

### Use Backticks to Enclose All Database Entities

That's just a good habit. This way, if you ever have an ambiguous fieldname
('active' or 'status' could be interpreted as keywords), your database
will know that you mean the fieldname or table, and not the keyword.

### Use Single Quotes to Enclose Strings

So you can enclose the entire query in double quotes, and be able to use
variables from PHP without concatenation. On the other hand: You should really
use prepared statements, and in my eyes concatenation is better than letting PHP
automatically substitute your vars. But hey, if you need to choose anyway,
might as well be single quotes.
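
For completeness, a sketch of what a prepared statement looks like with PDO. It uses an in-memory SQLite database purely so it runs without a server; that choice, the table and the data are all assumptions, and with MySQL only the connection string would differ.

```php
<?php
// In-memory SQLite is an assumption made so the sketch runs without a server
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE `books` (`id` INTEGER PRIMARY KEY, `title` TEXT, `rating` INTEGER)");
$pdo->exec("INSERT INTO `books` (`title`, `rating`) VALUES ('Dune', 9)");
$pdo->exec("INSERT INTO `books` (`title`, `rating`) VALUES ('Meh', 3)");

$minRating = 5; // imagine this came from user input

// The value travels separately from the SQL text, so no quoting is needed
$stmt = $pdo->prepare('
    SELECT `title`
    FROM `books`
    WHERE 1
      AND `rating` > :min_rating
    ORDER BY `title`
');
$stmt->bindValue(':min_rating', $minRating, PDO::PARAM_INT);
$stmt->execute();

$titles = $stmt->fetchAll(PDO::FETCH_COLUMN);
print_r($titles);
```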

### Nest with Two Spaces vs Tabs

I tend to do this in PHP & other languages as well. It allows for consistent layout in
all possible editors & views. The tab character sometimes also has unwanted (e.g. autocomplete)
impact when you paste into a console. There's an interesting post about it
[here](https://www.jwz.org/doc/tabs-vs-spaces.html).

## Conventions in General

Sometimes conventions rely more on taste than reason. Still it's helpful to settle on a single 'taste':

> It's irrelevant if people drive on the right or left side of the road. As long as they all do the same :)

While I'm not implying there will be fatal accidents if many different styles are deployed, 
it will contribute to the success of your project if you are able to agree on coding standards, 
and SQL is no exception.

This is version `1.2.0` of this style guide. I'm happy to hear what you're using, or if you'd like to see additions, just leave a comment so I can update accordingly! Sadly when moving to a new comment system many comments were lost, but we started at `0.0.1` so quite some improvements have already been contributed. Thank you!

]]></content:encoded>
      <dc:date>2009-03-04T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>7 Steps to Better PEAR Documentation</title>
      <link>https://kvz.io/7-steps-to-better-pear-documentation.html</link>
      <description><![CDATA[If you've written a PEAR package, it's probably a good idea to submit
some end user documentation. Here's how to do it.
]]></description>
      <pubDate>Sun, 22 Feb 2009 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/7-steps-to-better-pear-documentation.html</guid>
      <content:encoded><![CDATA[If you've written a [PEAR](https://pear.php.net/) package, it's probably a good idea to submit
some end user documentation. Here's how to do it.

<!--more-->

## How It Works

- PEAR documentation is stored in XML format in a CVS repository.
- There's a tool called *phd* that can convert this XML documentation to
  many formats: HTML, PDF, CHM, etc.
- The PEAR website periodically checks out the XMLs and uses phd to build
  all required documentation.
- The language: 'en' is leading for all other languages. So make your
  changes there.

So basically you:

- Save the current documentation
- Add your own
- Build & review the documentation locally
- Submit your addition (with CVS)

## In This Article

I'm going to use

- *[System\_Daemon](https://pear.php.net/package/System-Daemon/)* as example PEAR package.
- `kvz` as CVS username
- `kevin` as workstation username
- `~/workspace` as project location
- Ubuntu as Operating system.

So please just substitute these specifics with your own.

## 1. Prerequisites

### Install phd

```bash
$ pear install Console_CommandLine
$ pear install Console_Getopt
$ pear channel-discover doc.php.net
$ pear install doc.php.net/phd-beta
```

I also needed to up my memory limit (I set it to `128M`) in
`/etc/php5/cli/php.ini`

### Install CVS

In case of Ubuntu:

```bash
$ aptitude install cvs
```

### Request Access

If you're planning on maintaining these docs yourself.. You need write-access
to the CVS repository.

- Request CVS karma for peardoc (you'd best ask someone on #PEAR at EFnet
  IRC)
- [Request](https://www.php.net/cvs-php.php) CVS access

## 2. Save Current Documentation

Let's check out the peardoc folder of the CVS repository.

First you should try to get the PEAR manual compiled:

```bash
$ su kevin
$ cd ~/workspace
$ cvs -d :pserver:kvz@cvs.php.net:/repository login peardoc
$ cvs -d :pserver:kvz@cvs.php.net:/repository checkout peardoc
$ cd peardoc
```

## 3. Try Building the Docs

Let's just see if everything works:

```bash
$ php configure.php
# The command configure.php told you to execute, in my case:
$ phd -L en -f xhtml -t pearchunkedhtml -o build/en -d .manual.xml
```

**If this fails, fix it first.**

## 4. Write Your Own XML Docs

And so in this case the path to store my different chapters as XMLs would be:

```bash
en/package/system/system-daemon/*.xml
```

Also a 'homepage' of your docs can be placed here:

```bash
en/package/system/system-daemon.xml
```

### What Should the XML Look Like?

I've just looked at other packages like Console\_Table, and used them as
example.

## 5. Rebuild Peardoc

Again, build the docs, and check out the generated HTML output stored in `build/en/html/index.html`

You can just point your browser to e.g.:
file:///home/kevin/workspace/peardoc/build/ and follow the path to your
addition.

Well?

- Looks good? **Go to step 6**
- Looks miserable? **Go to step 4**

## 6. Commit Your XML

```bash
$ cvs add en/package/system/system-daemon/*.xml
$ cvs add en/package/system/system-daemon.xml
$ cvs commit
```

## 7. There Is No Step 7

Thanks to [Christian Weiske](https://cweiske.de) for holding my hand while I was taking these
first steps ; )
]]></content:encoded>
      <dc:date>2009-02-22T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Post Flood</title>
      <link>https://kvz.io/post-flood.html</link>
      <description><![CDATA[Hello everyone. Two days ago Feedburner offered me to merge my account with
Google. I thought: why not. But apparently now the URL of my feeds changed.
This messed up my stats, and your RSS reader has marked all of my posts as
unread. I'm very sorry for the inconvenience.
]]></description>
      <pubDate>Tue, 03 Feb 2009 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/post-flood.html</guid>
      <content:encoded><![CDATA[Hello everyone. Two days ago Feedburner offered me to merge my account with
Google. I thought: why not. But apparently now the URL of my feeds changed.
This messed up my stats, and your RSS reader has marked all of my posts as
unread. I'm very sorry for the inconvenience.

<!--more-->
]]></content:encoded>
      <dc:date>2009-02-03T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>A DRY Piece of Cake</title>
      <link>https://kvz.io/a-dry-piece-of-cake.html</link>
      <description><![CDATA[So I've been learning CakePHP the last few days. Bit by bit I've been
trying to port a legacy administration app to Cake. 'Secretly' linking
menuitems to finished Cake parts as we go. And I must say: I'm pretty excited.
I did run into a disturbing conclusion though. I estimated the legacy app will
have over 300 Models &amp; Controllers once finished. That could easily add up to
(300 x 4 =) 1200 views. And here I am, creating a maintenance hell while
trying to solve one!
]]></description>
      <pubDate>Fri, 30 Jan 2009 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/a-dry-piece-of-cake.html</guid>
      <content:encoded><![CDATA[So I've been learning [CakePHP](/categories/cakephp/) the last few days. Bit by bit I've been
trying to port a legacy administration app to Cake. 'Secretly' linking
menuitems to finished Cake parts as we go. And I must say: I'm pretty excited.
I did run into a disturbing conclusion though. I estimated the legacy app will
have over 300 Models & Controllers once finished. That could easily add up to
(300 x 4 =) 1200 views. And here I am, creating a [maintenance hell](https://www.codinghorror.com/blog/archives/000878.html) while
trying to solve one!

<!--more-->

Most of my views are **very similar**. They're all about managing data in a
table. Kind of like phpmyadmin but more from a business point-of-view. So I
thought: Why not share views between different controllers.

Note that I'm **not** working on your **average 2.0 site** here so your
mileage may vary. **Extremely**.

## Don't duplicate views. Share them

This approach basically allows me to only create models & controllers, but no
views. Still, I have the flexibility to create custom views for the remaining
(let's say) 30 special cases. I just create a view file in the same
directory as I would normally: Cake would find it, and use it instead of
falling back on the default view. Simple right?

To achieve that, **these were my steps**.

*N.B. The code that I provide here is a stripped version. I don't want to
bother you with my specifics.*

### 1. Enrich Models

Added information to the models explaining what fields are important, if they
have special rules. They will be interpreted by the `AppController` and then
fed to the `default` views in the form of a `$showFields` array.
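
The real model code isn't shown here, so what follows is only a guess at the shape of that step. The function, the `hidden` flag and the sample schema are made up; only the `formOptions` key is taken from the default view further down.

```php
<?php
/**
 * Hypothetical sketch: merge a Model->schema()-like array with per-model
 * field hints into the $showFields array used by the default views.
 */
function buildShowFields(array $schema, array $hints) {
    $showFields = array();
    foreach ($schema as $fieldName => $fieldData) {
        $hint = isset($hints[$fieldName]) ? $hints[$fieldName] : array();

        // Fields flagged as hidden never reach the default views
        if (!empty($hint['hidden'])) {
            continue;
        }

        $fieldData['formOptions'] = isset($hint['formOptions'])
            ? $hint['formOptions']
            : array();
        $showFields[$fieldName] = $fieldData;
    }

    return $showFields;
}

// Made-up schema and hints for an imaginary Issue model
$schema = array(
    'id'      => array('type' => 'integer'),
    'title'   => array('type' => 'string'),
    'created' => array('type' => 'datetime'),
);
$hints = array(
    'title'   => array('formOptions' => array('label' => 'Summary')),
    'created' => array('hidden' => true),
);

$showFields = buildShowFields($schema, $hints);
print_r(array_keys($showFields));
```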

### 2. AppController Fallback

This is the most important. We need to tell the `AppController` to switch to
the `default` view if a custom view cannot be found:

```php
<?php
class AppController extends Controller {
    /**
     * Set up all required view data just before viewing
     */
    function beforeRender() {
        /*
         * Switch to default view if specific controller view cannot be found.
         * Translates: app/views/dev_issues/index.ctp [NOT FOUND!]
         * to:         app/views/default/index.ctp
         *
         * Enabling you to only create views if you have very deviating ones, and have all
         * standard objects handled by a dynamic default view.
         */

        $viewPath = reset(Configure::read('viewPaths'));
        $subDir   = isset($this->subDir) && !is_null($this->subDir) ? $this->subDir . DS : null;
        $name     = $this->action;
        $name     = str_replace('/', DS, $name);
        $name     = $this->viewPath . DS . $subDir . Inflector::underscore($name);
        $path     = $viewPath . $name . $this->ext;
        if (!file_exists($path)) {
            // Change the viewPath so it will now try to load $this->action
            // from the 'default' directory
            $this->viewPath = 'default';
        }

        // Next I retrieve some Model->schema() information, enrich it,
        // and then I provide it to the view like so: $this->set('showFields', $showFields);
        // then the the 'default' views can iterate it and do some magic
    }
?>
```

(parts of this code were taken from the Cake core, and may need to be
optimised for this situation)
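
Stripped of the Cake specifics, the decision itself is small enough to try in isolation. The following is only a sketch: `resolveViewPath()` and its parameters are hypothetical names, and the directory tree is a throwaway one created in the system's temp directory.

```php
<?php
/**
 * Return the view directory to use: the controller's own directory if
 * the view file exists there, otherwise the shared 'default' directory.
 * $viewRoot, $controllerDir and $ext are illustrative parameters.
 */
function resolveViewPath($viewRoot, $controllerDir, $action, $ext = '.ctp') {
    $path = rtrim($viewRoot, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR
        . $controllerDir . DIRECTORY_SEPARATOR . $action . $ext;

    return file_exists($path) ? $controllerDir : 'default';
}

// Demo against a throwaway directory tree
$viewRoot = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'views_' . uniqid();
$ctrlDir  = $viewRoot . DIRECTORY_SEPARATOR . 'dev_issues';
mkdir($ctrlDir, 0777, true);
touch($ctrlDir . DIRECTORY_SEPARATOR . 'edit.ctp');

$custom   = resolveViewPath($viewRoot, 'dev_issues', 'edit');  // custom view exists
$fallback = resolveViewPath($viewRoot, 'dev_issues', 'index'); // no custom view

// Clean up the demo files again
unlink($ctrlDir . DIRECTORY_SEPARATOR . 'edit.ctp');
rmdir($ctrlDir);
rmdir($viewRoot);
```

Here `$custom` ends up as `dev_issues` and `$fallback` as `default`, which is the same switch `beforeRender()` makes by overwriting `$this->viewPath`.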

### 3. Create Default Views

Of course we can't skip this step. I created an '`/app/views/default`'
directory with views for some common actions, just as if `default` were a
normal controller. So we have the following layout:

```bash
/views
  /default
    index.ctp
    view.ctp
    add.ctp
    edit.ctp
```

I added code to those views so they can dynamically iterate through any
important Model fields they encounter in `$showFields`, which is set by the
`AppController`. For instance, my `edit.ctp` contains the following code:

```php
<?php
// Iterate through all the fields that are interesting in UI
// the $showFields is basically a modified Model->schema(), set by the
// AppController
foreach ($showFields as $fieldName=>$fieldData) {
    echo $form->input($fieldName, $fieldData['formOptions']);
}

// All important fields will be automagically drawn
?>
```

### 4. Resolve Helper

To avoid adding too much logic to views, I store some logic in helpers.

For example, I created the `ResolveHelper`. It's called from views, and will
translate any field value to a more meaningful one. This is done according to
rules in the Model, like its relation to parent Models.
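
To give an idea of the kind of translation meant here, this is a minimal, framework-free sketch. It is not the actual `ResolveHelper`: the `resolveField()` function, the `$rules` array and the `$lookups` array are all assumptions made for the example.

```php
<?php
/**
 * Hypothetical, framework-free version of the idea: the rules say which
 * fields point at a parent record, and the lookup tables hold the
 * display values for those parents.
 */
function resolveField($fieldName, $value, array $rules, array $lookups) {
    if (!isset($rules[$fieldName])) {
        return $value; // No rule: show the raw value
    }
    $parent = $rules[$fieldName];
    if (isset($lookups[$parent][$value])) {
        return $lookups[$parent][$value];
    }
    return $value; // Unknown parent record: fall back to the raw value
}

$rules   = array('user_id' => 'User');                // field => parent Model
$lookups = array('User' => array(21032 => 'Kevin'));  // parent Model => id => label

$label = resolveField('user_id', 21032, $rules, $lookups);
$plain = resolveField('title', 'Broken login', $rules, $lookups);
```

With these inputs `$label` becomes `Kevin` instead of the bare id, while `$plain` passes through untouched because no rule exists for `title`.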

## Fat AppController

Because my Models are so similar there will be 300 Controllers, each having at
least their own `index`, `view`, `edit` & `add` method:

```php
<?php
Class DevIssuesController extends AppController {

    function index() {

    }

    function view($id = null) {

    }

    function edit($id = null) {

    }

    function add() {

    }

}
?>
```

Now that's a lot of duplication right there. 300 `index()` methods? 300
`add()` methods? I don't think so.

So I moved out the most basic Controller methods (`index`, `view`, and the
like) to the `app_controller.php` file, which in fact, all other controllers
are based on.

This actually behaves in the same way as shared views. Basically:
`AppController->index()` will be called for every Controller **unless** you
specifically set one like `DevIssuesController->index()`. This is just OOP
nature that we take advantage of.

So basically I only add some common methods to my `AppController` and I'm done
with it. For the 30 Controllers that are too exceptional to be served by these
common methods, I can still write custom methods inside them. I can even call
`parent::index();` to still make use of the `AppController` logic, and expand
the method with additional logic as well.
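
The inheritance behaviour described above can be tried without Cake at all. In this sketch the `Sketch*` classes are stand-ins invented for the example, and the returned arrays just record which logic ran:

```php
<?php
// Stand-ins for Cake's classes, just to show the method resolution
class SketchAppController {
    function index() {
        return array('generic index');
    }
}

// No index() of its own: inherits the generic one
class SketchUsersController extends SketchAppController {
}

// Too exceptional for the generic method: overrides it, but still
// reuses the shared logic through parent::index()
class SketchDevIssuesController extends SketchAppController {
    function index() {
        $steps   = parent::index();
        $steps[] = 'extra DevIssues logic';
        return $steps;
    }
}

$users     = new SketchUsersController();
$devIssues = new SketchDevIssuesController();

$genericSteps = $users->index();      // only the shared logic
$customSteps  = $devIssues->index();  // shared logic plus the extension
```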

Of course these common methods will need some extra routines so they can
handle the common tasks decoupled from specific models. Keys in this approach
are:

- Allow for some configuration in your Models
- Stick to conventions

In my case, I ended up with a common `edit` method something like this:

```php
<?php
function edit($id = null) {
    if (empty($this->MyModel)) {
        $this->Session->setFlash(__('Generic AppController->edit cannot be ' .
            'used for this Controller. Please create a method: '.
                $this->name.'->'.__FUNCTION__.'()', true));
        return; // Nothing below can work without a Model
    }

    if (!$id && empty($this->data)) {
        $this->Session->setFlash(__('Invalid ' . $this->MyModel->title . '', true));
        return $this->redirect(array('action'=>'index'));
    }
    if (!empty($this->data)) {
        if ($this->MyModel->save($this->data)) {
            $this->Session->setFlash(__('The '.$this->MyModel->title.' has been saved', true));
            $this->redirect(array('action'=>'index'));
        } else {
            $this->Session->setFlash(__('The '.$this->MyModel->title.' could not be saved. Please, try again.', true));
        }
    }

    // Set Main data
    if (empty($this->data)) {
        $this->data = $this->MyModel->read(null, $id);
    }

    $data = array();

    $this->set($data);
}
?>
```

I may need to expand it a bit in the future to enable support for dependent
Models, etc. But that's all pretty straightforward. And the point is: I will
only need to make these changes in **one** place, and all 300 objects
can profit from it.

## What About Baking?

Baking is really good if you stick to Cake's conventions.

If you're working with legacy data like me, you may define things like
`$useTable`, `$primaryKey`, and `$foreignKey`. But you may find that these
properties are pretty much ignored by the baking process, resulting in text
fields where you would expect select boxes, and id: '21032' where you would
expect: 'Kevin'.

Besides, I would still end up with 1200+ views that I cannot change from that
point forward, because all would be lost on the next bake. Also, the views may
respond differently depending on environment variables like logged in user.

## What About Scaffolding?

Well scaffolding is not at all recommended for production use, and I feel my
current approach gives me more flexibility. Because by adding a bit of
configuration to models I really have fine-grained control over how the views
behave for different models. Reducing the need for exceptions. Reducing the
need for duplication.

## Downsides

Some people may feel that I'm coupling [MVC](/categories/mvc/) more than I should: a Model
knows about Controller aspects (though not much, I try to use the controller
to enforce its logic on the Model as much as possible).

While I am aware of these dangers, I feel this approach will also allow me to
create the 'legacy app' 50 times faster and so it really solves more problems
for me than it creates.

Besides, this is only true for the `default` system. Nothing stops me from
supplying Cake with additional real views, Controllers with real methods, and
they will be fully adhering to MVC.

## Conclusion

So while I don't usually like my Cake [DRY](/categories/dry/), I'm very happy with the way
this is going!

Now, though I may have some programming skills, I'm new to Cake, so I really
welcome insightful comments on all this.
]]></content:encoded>
      <dc:date>2009-01-30T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Create Daemons in PHP</title>
      <link>https://kvz.io/create-daemons-in-php.html</link>
      <description><![CDATA[Everyone knows PHP can be used to create websites. But it can also be used
to create desktop applications and commandline tools. And now with a class
called System_Daemon, you can even create daemons using nothing but PHP.
And did I mention it was easy?
]]></description>
      <pubDate>Fri, 09 Jan 2009 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/create-daemons-in-php.html</guid>
      <content:encoded><![CDATA[Everyone knows PHP can be used to create websites. But it can also be used
to create desktop applications and commandline tools. And now with a class
called System\_Daemon, you can even create daemons using nothing but PHP.
And did I mention it was easy?

<!--more-->

## Update 4 Dec, 2012: Legacy Warning

This class was relevant in 2009, and may still be to some people, but if you
want to daemonize a php script nowadays, a 5-line Ubuntu `upstart` script should suffice.
Upstart will daemonize your script and can even respawn it if it goes down. Something
that you'd otherwise have to use `monit` for.
I have another article on [daemonizing nodejs scripts with upstart](https://kvz.io/blog/2009/12/15/run-nodejs-as-a-service-on-ubuntu-karmic/), that can just as well be applied to PHP scripts.

If you're still convinced you need to do this with pure PHP, read on.

## What Is a Daemon?

A daemon is a Linux program that runs in the background, just like a
*'Service'* on Windows. It can perform all sorts of tasks that do not
require direct user input. Apache is a daemon, so is MySQL. All you ever
hear from them is found somewhere in `/var/log`, yet they silently power
over 40% of the Internet.

You reading this page would not have been possible without them.
So clearly: a daemon is a powerful thing, and can be bent to do a lot
of different tasks.

## Why PHP?

Most daemons are written in C. It's fast & robust. But if you are in a LAMP
oriented company like me, and you need to create a lot of software in PHP
anyway, it makes sense:

- **Reuse & connect existing code**
  Think of database connections, classes that create customers from your CRM, etc.
- **Deliver new applications very fast**
  PHP has a lot of built-in functions that speed up development greatly.
- **Everyone knows PHP** (right?)
  If you work in a small company: chances are there are more PHP programmers than there are C programmers. What if your C guy abandons ship? Admittedly it's a very pragmatic reason, but it's the same reason why Facebook is building [HipHop](https://github.com/facebook/hiphop-php "HipHop by Facebook").

## Possible Use Cases

- **Website optimization**
  If you're running a (large) website, jobs that do heavy lifting should be taken away from the user interface and scheduled to run on the machine separately.
- **Log parser**
  Continually scan logfiles and import critical messages in your database.
- **SMS daemon**
  Read a database queue, and let your little daemon interface with your SMS provider. If it fails, it can easily try again later!
- **Video converter** (think YouTube)
  Scan a queue/directory for incoming video uploads. Make some system calls to *ffmpeg* to finally store them as Flash video files. Surely you don't want to convert video files right after the upload, blocking the user interface that long? No: the daemon will send the uploader a mail when the conversion is done, and proceed with the next scheduled upload.

## Daemons vs Cronjobs

Some people [use cronjobs](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/) for the same *Possible use cases*. Crontab is fine but it only allows you to run a PHP file every minute or so.

- What if the previous **run hasn't finished** yet? Overlap can seriously damage your data & cause significant load on your machines.
- What if you can't afford to wait a minute for the cronjob to run? Maybe you need to trigger a process **the moment** a record is inserted?
- What if you want to keep track of multiple *'runs'* and store data in memory?
- What if you need to keep your application listening (on a socket for example)?

Cronjobs are a bit crude for this: they may spin out of control and don't provide a **framework for logging**, etc. Creating a daemon would offer more elegance & possibilities. Let's just say: there are very good reasons why a **Linux OS isn't composed entirely of Cronjobs** :)

## How It Works Internally

(*Nerd alert!*) When a daemon program is started, it fires up a second
child process, detaches it, and then the parent process dies. This is
called *forking*. Because the parent process dies, it will give the
console back and it will look like nothing has happened. **But wait**:
the child process is still running. Even if you close your terminal,
the child continues to run in memory, until it either stops, crashes
or is killed.

In PHP, forking can be achieved by using the [Process Control Extensions](https://www.php.net/manual/en/book.pcntl.php).
Getting a good grip on it may take some studying though.
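
For the curious, the fork-and-detach step can be sketched with plain `pcntl` and `posix` calls. This is a bare-bones illustration under the assumption that both extensions are loaded in PHP-CLI on a Unix system. `daemonizeSketch()` is a made-up name, and it is not how any particular library implements it.

```php
<?php
/**
 * Minimal fork-and-detach sketch. Returns false when the pcntl/posix
 * extensions are missing or when forking fails, true in the detached child.
 * The parent process exits inside this function.
 */
function daemonizeSketch() {
    if (!function_exists('pcntl_fork') || !function_exists('posix_setsid')) {
        return false; // Extensions not available (e.g. not PHP-CLI on Unix)
    }

    $pid = pcntl_fork();
    if ($pid === -1) {
        return false; // Fork failed
    }
    if ($pid > 0) {
        exit(0); // Parent: die, handing the console back
    }

    // Child: become session leader so we no longer depend on the terminal
    if (posix_setsid() === -1) {
        return false;
    }
    return true;
}
```

Calling `daemonizeSketch()` at the top of a CLI script would leave only the child running whatever code follows. A real daemon needs quite a bit more than this (signal handling, logging, and so on), so treat it as an explanation of the mechanism, not as something to deploy.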

## System\_Daemon

Because the [Process Control Extensions](https://www.php.net/manual/en/book.pcntl.php)' documentation is a bit rough,
I decided to figure it out once, and then wrap my knowledge and the required
code inside a [PEAR](/categories/pear/) class called: [System\_Daemon](https://pear.php.net/package/System-Daemon).
And so now you can just:

```php
<?php
require_once "System/Daemon.php";                 // Include the Class

System_Daemon::setOption("appName", "mydaemon");  // Minimum configuration
System_Daemon::start();                           // Spawn Daemon!
?>
```

And that's all there is to it. The code after that will run in your
server's background. So next, if you create a `while(true)` loop around
that code, the code will run indefinitely. Remember to build in a `sleep(5)`
to ease up on system resources.

### Features & Characteristics

Here's a grab of System\_Daemon's features:

- Daemonize any PHP-CLI script
- Simple syntax
- Driver based Operating System detection
- Unix only
- Additional features for Debian / Ubuntu based systems like:
  - Automatically generate startup files (init.d)
- Support for PEAR's Log package
- Can run with PEAR (more elegance & functionality) or without PEAR (for standalone packages)
- Default signal handlers, but optionally reroute signals to your own handlers.
- Set options like max RAM usage

### Download

You could [download the package](https://pear.php.net/package/System-Daemon/download) from PEAR, or, if you have PEAR installed
on your system: just run this from a console:

```bash
$ pear install -f System_Daemon
```

You can also update it using that last command.

### Beta

Though I have quite a few daemons set up this way, it's officially
still beta. So please [report any bugs](https://pear.php.net/bugs/search.php?cmd=display&package-name[0]=System-Daemon) over at the PEAR page.
Other comments may be posted here.

## Complex Example

Ready to dig a little deeper? This example program is called 'logparser'.
It takes a look at a more complex use of System\_Daemon. For instance, it
introduces command line arguments like:

```bash
--no-daemon              # just run the program in the console this time
--write-initd            # writes a startup file
```

Read this source to get the picture. Don't forget the comments!

```php
#!/usr/bin/php -q
<?php
/**
 * System_Daemon turns PHP-CLI scripts into daemons.
 *
 * PHP version 5
 *
 * @category  System
 * @package   System_Daemon
 * @author    Kevin <kevin@vanzonneveld.net>
 * @copyright 2008 Kevin van Zonneveld
 * @license   https://www.opensource.org/licenses/bsd-license.php
 * @link      https://github.com/kvz/system_daemon
 */

/**
 * System_Daemon Example Code
 *
 * If you run this code successfully, a daemon will be spawned
 * but unless you have already generated the init.d script, you have
 * no real way of killing it yet.
 *
 * In this case wait 3 runs, which is the maximum for this example.
 *
 *
 * In panic situations, you can always kill your daemon by typing
 *
 * killall -9 logparser.php
 * OR:
 * killall -9 php
 *
 */

// Allowed arguments & their defaults
$runmode = array(
    'no-daemon' => false,
    'help' => false,
    'write-initd' => false,
);

// Scan command line attributes for allowed arguments
foreach ($argv as $k=>$arg) {
    if (substr($arg, 0, 2) == '--' && isset($runmode[substr($arg, 2)])) {
        $runmode[substr($arg, 2)] = true;
    }
}

// Help mode. Shows allowed arguments and quits directly
if ($runmode['help'] == true) {
    echo 'Usage: '.$argv[0].' [runmode]' . "\n";
    echo 'Available runmodes:' . "\n";
    foreach ($runmode as $runmod=>$val) {
        echo ' --'.$runmod . "\n";
    }
    die();
}

// Make it possible to test in source directory
// This is for PEAR developers only
ini_set('include_path', ini_get('include_path').':..');

// Include Class
error_reporting(E_STRICT);
require_once 'System/Daemon.php';

// Setup
$options = array(
    'appName' => 'logparser',
    'appDir' => dirname(__FILE__),
    'appDescription' => 'Parses vsftpd logfiles and stores them in MySQL',
    'authorName' => 'Kevin van Zonneveld',
    'authorEmail' => 'kevin@vanzonneveld.net',
    'sysMaxExecutionTime' => '0',
    'sysMaxInputTime' => '0',
    'sysMemoryLimit' => '1024M',
    'appRunAsGID' => 1000,
    'appRunAsUID' => 1000,
);

System_Daemon::setOptions($options);

// This program can also be run in the foreground with runmode --no-daemon
if (!$runmode['no-daemon']) {
    // Spawn Daemon
    System_Daemon::start();
}

// With the runmode --write-initd, this program can automatically write a
// system startup file called: 'init.d'
// This will make sure your daemon will be started on reboot
if (!$runmode['write-initd']) {
    System_Daemon::info('not writing an init.d script this time');
} else {
    if (($initd_location = System_Daemon::writeAutoRun()) === false) {
        System_Daemon::notice('unable to write init.d script');
    } else {
        System_Daemon::info(
            'successfully written startup script: %s',
            $initd_location
        );
    }
}

// Run your code
// Here comes your own actual code

// This variable gives your own code the ability to breakdown the daemon:
$runningOkay = true;

// This variable keeps track of how many 'runs' or 'loops' your daemon has
// done so far. For example purposes, we're quitting on 3.
$cnt = 1;

// While checks on 3 things in this case:
// - That the Daemon Class hasn't reported it's dying
// - That your own code has been running Okay
// - That we're not executing more than 3 runs
while (!System_Daemon::isDying() && $runningOkay && $cnt <=3) {
    // What mode are we in?
    $mode = '"'.(System_Daemon::isInBackground() ? '' : 'non-' ).
        'daemon" mode';

    // Log something using the Daemon class's logging facility
    // Depending on runmode it will either end up:
    //  - In the /var/log/logparser.log
    //  - On screen (in case we're not a daemon yet)
    System_Daemon::info('{appName} running in %s %s/3',
        $mode,
        $cnt
    );

    // In the actual logparser program, you could replace 'true'
    // with e.g. a parseLog('vsftpd') function, and have it return
    // either true on success, or false on failure.
    $runningOkay = true;
    //$runningOkay = parseLog('vsftpd');

    // Should your parseLog('vsftpd') return false, then
    // the daemon is automatically shut down.
    // An extra log entry would be nice, we're using level 3,
    // which is critical.
    // Level 4 would be fatal and shuts down the daemon immediately,
    // which in this case is handled by the while condition.
    if (!$runningOkay) {
        System_Daemon::err('parseLog() produced an error, '.
            'so this will be my last run');
    }

    // Relax the system by sleeping for a little bit
    // iterate also clears statcache
    System_Daemon::iterate(2);

    $cnt++;
}

// Shut down the daemon nicely
// This is ignored if the class is actually running in the foreground
System_Daemon::stop();
?>
```

## Console Action: Controlling the Daemon

Now that we've created an example daemon, it's time to fire it up!
I'm going to assume the name of your daemon is *logparser*. This can
be changed with the statement:
`System_Daemon::setOption('appName', 'logparser')`.
But the name of the daemon is very important because it is also used
in filenames (like the logfile).

### Execute

Just make your daemon script executable, and then execute it:

```bash
$ chmod a+x ./logparser.php
$ ./logparser.php
```

### Check

Your daemon has no way of communicating through your console,
so check for messages in:

```bash
$ tail /var/log/logparser.log
```

And see if it's still running:

```bash
$ ps uf -C logparser.php
```

### Kill

Without the *start/stop files* (see below for howto), you need to:

```bash
$ killall -9 logparser.php
```

Ouch. Let's work on those start / stop files, right?

### Start / Stop Files (Debian & Ubuntu Only)

Real daemons have an `init.d` file. Remember you can restart Apache with
the following statement?

```bash
$ /etc/init.d/apache2 restart
```

That's a lot better than killing. So you should be able to control your
own daemon like this as well:

```bash
$ /etc/init.d/logparser stop
$ /etc/init.d/logparser start
```

Well with System\_Daemon you can write autostartup files using the
`writeAutoRun()` method, look:

```php
<?php
$path = System_Daemon::writeAutoRun();
?>
```

On success, this will return the path to the autostartup file:
`/etc/init.d/logparser`, and you're good to go!

### Run on Boot

Surely you want your daemon to run at system boot..
So on Debian & Ubuntu you could type:

```bash
$ update-rc.d logparser defaults
```

Done: your daemon now starts every time your server boots.
Cancel it with:

```bash
$ update-rc.d -f logparser remove
```

## Logrotate

[Igor Feghali](https://pear.php.net/user/ifeghali) shares with us a logrotate config file to keep your
log files from growing too large. Just place a file in your
`/etc/logrotate.d/`.

```bash
/var/log/mydaemon.log {
   rotate 15
   compress
   missingok
   notifempty
   sharedscripts
   size 5M
   create 644 mydaemon_user mydaemon_group
   postrotate
       /bin/kill -HUP `cat /var/run/mydaemon/mydaemond.pid 2>/dev/null` 2> /dev/null || true
   endscript
}
```

Obviously, replace the `mydaemon` occurrences with values corresponding
to your environment. Thanks Igor!

## Troubleshooting

Here are some issues you may encounter down the road.

- **Don't use echo()**
  Echo writes to the STDOUT of your current session. If you log out, that will cause fatal errors and the daemon to die. Use `System_Daemon::log()` instead.
- **Connect to MySQL after you `start()` the daemon.**
  Otherwise only the parent process will have a MySQL connection, and since that dies.. It's lost and you will get a 'MySQL has gone away' error.
- **Error handling**
  Good error handling is imperative. Daemons are often mission critical applications and you don't want an uncaught error to bring it to its knees.
- **Reconnect to MySQL**
  A connection may be interrupted. Think about network downtime or lock-ups when your database server makes backups. Whatever the cause: You don't want your daemon to die for this, let it try again later.
- **PHP error handler**
  As of [0.6.3](https://pear.php.net/package/System-Daemon/download/0.6.3), System\_Daemon forwards all PHP errors to the log() method, so keep an eye on your logfile. This behavior can be controlled using the `logPhpErrors` (true||false) option.
- **Monit**
  Monit is a standalone program that can kickstart any daemon, based on your parameters. Should your daemon fail, monit will mail you and try to restart it.
- **Watch that memory**
  Some classes keep a history of executed commands, sent mails, queries, whatever. They were designed without knowing they would ever be used in a daemonized environment.
  Because daemons run indefinitely, this 'history' will expand indefinitely. Since unfortunately your server's RAM is not infinite, you will run into problems at some point.
  This makes it very important to address these memory 'leaks' when building daemons.
- **Statcache will corrupt your data**
  If you do a `file_exists()`, PHP remembers the results to ease on your disk until the process ends. That's ok but since the Daemon process does not end, PHP will not be able to give you up to date information. As of [0.8.0](https://pear.php.net/package/System-Daemon/download/0.8.0) you should call `System_Daemon::iterate(2)` instead of e.g. `sleep(2)`. This will sleep & clear the cache and give you fresh & valid data.

I know I'm saying MySQL a lot, but you can obviously replace that with Oracle, MSSQL, PostgreSQL, etc.
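
To make the statcache point from the list above a bit more concrete, here is a small sketch in plain PHP using `clearstatcache()` directly. The temp file and variable names are just for the demo. Whether the middle reading is actually stale can depend on your PHP version and setup, so the only thing to rely on is that the value read after `clearstatcache()` is fresh.

```php
<?php
// Throwaway file to demonstrate the stat cache
$file = tempnam(sys_get_temp_dir(), 'statcache_');

file_put_contents($file, 'abc');
$before = filesize($file);            // PHP may cache this result

file_put_contents($file, 'defgh', FILE_APPEND);
$maybeStale = filesize($file);        // can still be the cached value

clearstatcache();                     // drop the cached stat information
$fresh = filesize($file);             // now reflects what is on disk

unlink($file);
```

In a long-running loop, the same call (or a wrapper that does it for you between iterations) is what keeps `file_exists()`, `filesize()` and friends honest.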
]]></content:encoded>
      <dc:date>2009-01-09T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>My New IDE: NetBeans</title>
      <link>https://kvz.io/my-new-ide-netbeans.html</link>
      <description><![CDATA[Writing code requires two important things: creativity &amp; discipline. The
creativity to create the unknown, unexplored, exciting parts of software.
And the discipline to create the dull &amp; all-too-well-known parts of
software / documentation.
]]></description>
      <pubDate>Tue, 02 Dec 2008 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/my-new-ide-netbeans.html</guid>
      <content:encoded><![CDATA[Writing code requires two important things: *creativity* & *discipline*. The
**creativity** to create the unknown, unexplored, exciting parts of software.
And the **discipline** to create the dull & all-too-well-known parts of
software / documentation.

<!--more-->

You may come up with new ways (or use frameworks) to reduce repetitive work.
Effectively beating discipline with creativity. But boring stuff will still
always be there in some form. And on days when creativity is low, you may need
to tap into that jar of discipline so you can still be productive, by doing
things you never feel like.

But every now & then, there is a day when both creativity & discipline are
low. To prevent such a day from going to waste, I figure I can do 3 things:

- Drink coffee, get back at it
- Drink some more coffee, get back at it
- Use this day to learn & invest in tools & skill

Recently I had a rough night involving little sleep & one or maybe two drinks.
I'll spare you the details. The next morning my two coding fuels: creativity &
discipline were at an all-time low. I ended up repeating steps **1** & **2**,
but they just didn't cut it ; ) It was clear that this day was not going to be
my best coding day in the world ever.

So I turned to step **3** and decided to invest that day in **learning
tools**. This may not result in immediate production, but that's why it's
called investing ; ) I'm now learning & investing in things that will hopefully
increase my productivity in the future. As an added bonus: on a day otherwise
lost.

## Eclipse PDT

A programmer's primary tool is his [IDE](/categories/ide/). The end-product is in your head
but the tool will help you craft it. Good tools will help you craft it better
or faster. Or both. Readers of my blog may have noticed my **love** ([1](/blog/2008/04/11/my-new-ide-eclipse-pdt/))
**hate** ([2](/blog/2008/10/14/rescuing-my-messed-up-eclipse/)) ([3](/blog/2008/10/16/orgeclipseemfecoreutilecoreemap/)) relationship with my current IDE: [Eclipse PDT](/categories/eclipse/).
We have our ups & downs, this has never changed. Winston Churchill once said:

>  "...democracy is the worst form of government except all the others that have
> been tried"

This best illustrates how I feel:

>  "Eclipse is the worst IDE except all the others that have been tried"

But now that blog posts on a 'new' 'radical' IDE called 'NetBeans' have been
clogging up my RSS reader, and I had an entire day of coffee & learning ahead
of me, it was time to give it a try.

## NetBeans

![sqlcc.png](/assets/images/posts/sqlcc.png "sqlcc.png")NetBeans is an IDE by Sun and has been around for a while. What's new in
6.5 is its support for PHP. That's what makes it interesting for me.

The base install is no different from Eclipse PDT. Download the archive,
extract it, run it. That's it.

What complicated things with Eclipse were the steps afterwards: adding CSS
support, adding SVN support, having to struggle through their ever so poor
plugin & update system. Manually selecting mirrors & dependencies. Not to
speak of performance issues.

Not the case with NetBeans. They must have really looked at a PHP developer's
daily job, because everything you need is already in there: SVN, CVS, CSS,
**SQL**, and even support for **jQuery**! This even works within 1 document:
NetBeans figures out what's JavaScript, what's PHP, and indexes & highlights
all elements accordingly. And you can even connect to a MySQL database. This
is all out of the box.

And if a feature is missing, the NetBeans plugin system Just Works. Go ahead &
install additional features. No need for a science degree there.

Code completion is fast & accurate. Manuals are integrated. Existing Eclipse
projects can be **imported**, no need to keep separate workspace directories.
Just switch back and forth between NetBeans & Eclipse (or your other IDE of
choice) until you've made up your mind.

## Conclusion

If you want a complete list of features just check out [one](https://codeutopia.net/blog/2008/12/01/netbeans-65-review/) of many other
blog posts about NetBeans or check out [a screencast by the creators](https://blogs.sun.com/netbeansphp/entry/demo-of-the-php-support). For
me it suffices to say: It's like a **lightweight** Eclipse, with a couple of
very powerful additional features (all out of the box), and it Just Works.

In fact, I wasn't prepared for such a smooth ride. I continued in a fully
working environment, learned some keyboard shortcuts, played with the
refactoring tool (awesome). And maybe it was the coffee, but before I knew it,
my **creativity** kicked back in and I even got some serious work done that
day :) Who would have thought.

## What to Do Next

#### Not Convinced?

Pictures say more than a thousand words. So check out some of these links:

- [SQL code completion in the PHP editor](https://blogs.sun.com/netbeansphp/entry/sql-code-completion-in-the)
- [Demo of the PHP distribution of NetBeans 6.5 - Part II](https://blogs.sun.com/netbeansphp/entry/demo-of-the-php-distribution)
- [Marking occurrences improved](https://blogs.sun.com/netbeansphp/entry/marking-types-in-php-documentation)

#### Then: Download!

Want to give it a shot? Here's the [download link](https://www.netbeans.org/downloads/). You can choose your
flavour: Java SE, JavaFX, Java, Ruby, C/C++, **PHP**, or just *All*.
]]></content:encoded>
      <dc:date>2008-12-02T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Search for a Package With apt-file</title>
      <link>https://kvz.io/search-for-a-package-with-aptfile.html</link>
      <description><![CDATA[Recently I needed ogg123 on an Ubuntu server to convert some media.
Naturally, I wanted to use aptitude to install it, but I didn't know what
package it was in. Now, you can always google of course, but you can also use
system commands to find the package you need.
]]></description>
      <pubDate>Sat, 08 Nov 2008 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/search-for-a-package-with-aptfile.html</guid>
      <content:encoded><![CDATA[Recently I needed `ogg123` on an Ubuntu server to convert some media.
Naturally, I wanted to use `aptitude` to install it, but I didn't know what
package it was in. Now, you can always google of course, but you can also use
system commands to find the package you need.

<!--more-->

## Already Have the File? Use dpkg

If you already have the file, and just want to know which package it belongs
to, you can use `dpkg` like this:

```bash
$ dpkg -S $(which ogg123)
```

(replace `ogg123` with the command you are looking for)

## Don't have it yet? Use apt-file

I didn't have the package yet, so dpkg doesn't know about it either.

In this case you can install a nifty little program called `apt-file`, which
fetches & builds a database of all apt packages and their contents, so you can
easily find what you need.

### Installing apt-file

Ok, let's install apt-file first.

```bash
$ aptitude install apt-file
```

`apt-file` works just like `apt`, so we need to update its database before we
can use it.

```bash
$ apt-file update
```

### Search

Now we're ready to search for our package:

```bash
$ apt-file search ogg123
```

Which in my case returned:

```bash
irssi-scripts: /usr/share/irssi/scripts/ogg123.pl
python-pyvorbis: /usr/share/doc/python-pyvorbis/examples/ogg123.py
vorbis-tools: /usr/bin/ogg123
vorbis-tools: /usr/share/doc/vorbis-tools/examples/ogg123rc-example
vorbis-tools: /usr/share/man/man1/ogg123.1.gz
```

Very well, `vorbis-tools` it is!

```bash
$ aptitude install vorbis-tools
```
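
If the result list gets long, you can narrow it down to packages that install an actual executable with that exact name. This is only a minimal sketch, assuming the `package: /path` output format shown above. The helper name `packages_providing_command` is made up for this example and is not part of `apt-file`:

```bash
# Hypothetical helper: reads "package: /path/to/file" lines on stdin and
# prints the packages that ship an executable named exactly like $1.
packages_providing_command() {
  local cmd="$1"
  awk -v cmd="$cmd" -F': ' '
    {
      # Split the path into directory and file name
      n = split($2, parts, "/")
      name = parts[n]
      dir = substr($2, 1, length($2) - length(name))
      # Keep only exact name matches inside the usual binary directories
      if (name == cmd && (dir == "/bin/" || dir == "/sbin/" || dir == "/usr/bin/" || dir == "/usr/sbin/")) {
        print $1
      }
    }
  ' | sort -u
}
```

You would use it as `apt-file search ogg123 | packages_providing_command ogg123`.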

You've just got to love the apt.
]]></content:encoded>
      <dc:date>2008-11-08T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>org.eclipse.emf.ecore.util.EcoreEMap</title>
      <link>https://kvz.io/orgeclipseemfecoreutilecoreemap.html</link>
      <description><![CDATA[One error that has bugged my Eclipse PDT for a long time, was
org.eclipse.emf.ecore.util.EcoreEMap $DelegateEObjectContainmentEList. A
vague error, not much to go on, not many hits on google either. Turned out it
had to do with the version of my Java Runtime Environment I was using.
]]></description>
      <pubDate>Thu, 16 Oct 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/orgeclipseemfecoreutilecoreemap.html</guid>
      <content:encoded><![CDATA[One error that has bugged my Eclipse PDT for a long time, was
_org.eclipse.emf.ecore.util.EcoreEMap $DelegateEObjectContainmentEList._ A
vague error, not much to go on, not many hits on google either. Turned out it
had to do with the version of my Java Runtime Environment I was using.

<!--more-->

## Current Java Version in Use

You can check what Java Runtime version Ubuntu is using by entering:

```bash
$ java -version
```

In my case, it resulted in the following output

```bash
java version "1.5.0"
gij (GNU libgcj) version 4.2.3 (Ubuntu 4.2.3-2ubuntu6)

Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
```

Even though I have the original Sun Runtime installed:

```bash
$ dpkg -l |grep jre
ii  sun-java6-jre    6-07-3ubuntu2    Sun Java(TM) Runtime Environment (JRE) 6
```

It was not being used. It was before. I don't know what changed it.

## Available Java Versions on System

To change it back, I used

```bash
$ sudo update-java-alternatives -l
```

to show me what candidates were available:

```bash
java-6-sun 63 /usr/lib/jvm/java-6-sun
java-gcj 1042 /usr/lib/jvm/java-gcj
```

## Change Java Version

I really want to change it back to Sun's version, so I run

```bash
$ sudo update-alternatives --config java
```

Which will show me

```bash
There are 4 alternatives which provide `java'.
  Selection    Alternative
-----------------------------------------------
          1    /usr/bin/gij-4.2
          2    /usr/bin/gij-4.1
*+        3    /usr/lib/jvm/java-gcj/jre/bin/java
          4    /usr/lib/jvm/java-6-sun/jre/bin/java

Press enter to keep the default[*], or type selection number:
```

So I press **4**

And this gives me:

```bash
Using '/usr/lib/jvm/java-6-sun/jre/bin/java' to provide 'java'.
```
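
The menu is interactive, which is awkward when you want to script this. As a rough sketch (the function name `alternative_number` is invented here, and it assumes the table layout shown above), you could look up the selection number that belongs to a given path:

```bash
# Hypothetical helper: reads the `update-alternatives --config` table on
# stdin and prints the selection number for the alternative path in $1.
alternative_number() {
  local wanted="$1"
  awk -v wanted="$wanted" '
    # Drop the "*+" status markers and leading spaces, if any
    { sub(/^[*+ ]+/, "") }
    # After that, column 1 is the number and column 2 is the path
    $2 == wanted { print $1 }
  '
}
```

Where your version of `update-alternatives` supports `--set`, passing it the path directly avoids the menu altogether.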

## Confirm Java Version

To confirm it is really used, I ask Java its version again:

```bash
$ java -version
```

Telling me:

```bash
java version "1.6.0_07"
Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
Java HotSpot(TM) Server VM (build 10.0-b23, mixed mode)
```
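
If this keeps flipping back on you, a small check can tell the two runtimes apart. The sketch below only looks for the strings visible in the two outputs above ("GNU libgcj" and "HotSpot"); the function name `java_runtime_kind` is an assumption for illustration:

```bash
# Hypothetical helper: classify `java -version` output read from stdin.
java_runtime_kind() {
  local output
  output="$(cat)"
  case "$output" in
    *"GNU libgcj"*) echo "gcj" ;;      # the GNU runtime (gij)
    *"HotSpot"*)    echo "hotspot" ;;  # the Sun runtime
    *)              echo "unknown" ;;
  esac
}
```

Sun's `java -version` writes to stderr, so call it as `java -version 2>&1 | java_runtime_kind`.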

Alright, all went well. When I run Eclipse PDT again, all of my errors are
gone.

Thank you [Ubuntu](https://help.ubuntu.com/community/Java)
]]></content:encoded>
      <dc:date>2008-10-16T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Rescuing my Messed Up Eclipse</title>
      <link>https://kvz.io/rescuing-my-messed-up-eclipse.html</link>
      <description><![CDATA[Hi folks. As you may or may not know, I have a
love/hate relationship with my IDE: Eclipse PDT.
Most of the time we get along well. But once in a
while it gets messed up, and it's a pain to straighten it out again. Or at
least, it was.
]]></description>
      <pubDate>Tue, 14 Oct 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/rescuing-my-messed-up-eclipse.html</guid>
      <content:encoded><![CDATA[Hi folks. As you may or may not know, I have a
[love/hate relationship with my IDE: Eclipse PDT](/blog/2008/04/11/my-new-ide-eclipse-pdt/).
Most of the time we get along well. But once in a
while it gets messed up, and it's a pain to straighten it out again. Or at
least, it was.

<!--more-->

## Update Gone Bad

After today's update I kept getting errors like:

```bash
Subversive SVN Integration for the Mylyn Project (Optional) (Incubation) (0.7.4.I20081001-1900) requires feature "org.eclipse.mylyn_feature (3.0.0)", or compatible.
JST Web Core (2.0.301.v200806230800-7Q7AE7LEHhHeh0hQ7qz0VBC) requires plug-in "org.eclipse.emf.ecore.xmi (2.3.0)", or equivalent.

java.lang.NullPointerException
```

This rendered Eclipse useless, forcing me to spend hours on manual dependency
resolution in the minefield they call the plugin manager (*or do they?*). The
random crashes and the automagic selection of mirrors that give 404 errors
don't make it any less frustrating either.

With all the user-friendliness around us (Apple, Ubuntu, web 2.0) nowadays, I
just can't get my head around this terrible mess.

Don't release updates when they have a bigger chance of breaking than fixing
things.

## Revert to Previous

And just when I reach the point where I'm starting to forget all of the good
times me and my IDE have had, a little link finds me: "Revert to Previous".

![eclipsebrok01.png](/assets/images/posts/eclipsebrok01.png "eclipsebrok01.png")

OK.. This just might work. If I can go back in time, I could undo my updates,
restoring Eclipse to a working, thus, awesome state.

## Time Machine

Well let's see...

![eclipsebrok02.png](/assets/images/posts/eclipsebrok02.png "eclipsebrok02.png")

hm. So far so good.

And so I select a point in time that I want to revert to, and just click
Finish.

Eclipse asks my permission to restart, lets me save some unfinished work, and
voilà. I'm back in business.

## Bumpy Ride

Well that was one bumpy ride. But it will hopefully be my last, thanks to this
feature.

Saved by the bell, Eclipse.. I had almost given up on you.
]]></content:encoded>
      <dc:date>2008-10-14T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>How Virtualization Will Improve Your Code</title>
      <link>https://kvz.io/how-virtualization-will-improve-your-code.html</link>
      <description><![CDATA[Good testing will result in better code. If you have to wait endlessly for
SVN commits, uploads or compile steps, you will simply produce less inventive
code. This has to do with: patience, creativity flow, will, and of course
time. Constantly being interrupted breaks concentration. If there's one thing
I've really learned, it's invest in a good testing environment. Rapid
review of code results will pay off (I promise).
]]></description>
      <pubDate>Mon, 06 Oct 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/how-virtualization-will-improve-your-code.html</guid>
      <content:encoded><![CDATA[Good testing will result in better code. If you have to wait endlessly for
SVN commits, uploads or compile steps, you will simply produce less inventive
code. This has to do with: patience, creativity flow, will, and of course
time. Constantly being interrupted breaks concentration. If there's one thing
I've really learned, it's **invest in a good testing environment**. Rapid
review of code results will pay off (I promise).

<!--more-->

So it's OK to spend some time on [learning a good IDE](/blog/2008/12/02/my-new-ide-netbeans/). Another trick
to improve the speed & quality of development is to virtualize your
production platform on your local workstation. The fake-production server
(virtual machine) mounts your code directory directly as its webroot, so
it can serve your IDE work instantly.

## Basic Idea

Check out this picture

![phpjs.png](/assets/images/posts/phpjs.png "phpjs.png")

You can:

- save your code in your IDE
- switch to your 'fake production server', running as a virtual machine
- instantly view results, as if it was served by your real server

Some advantages

- no delays
- no touching of production environment
- clone machine and experiment with different PHP / module versions
- shut down virtual machine when done programming

## About the Article

This setup took me a day. Writing the article added some time, but should
enable you to do it in a couple of hours.

I'm setting this up on an Ubuntu workstation. But since the actual work is
done inside a virtual machine, this should work on every OS (Windows, Mac,
FreeBSD, everything). For support on setting up VirtualBox on those operating
systems, just check out the VirtualBox site & forum.

System administration skills are required. You could mess up your system. I
warned you.

## Ubuntu on Ubuntu?

You may have noticed I already run Ubuntu. So why not directly run Apache on
top of that? Why install another virtual Ubuntu? Simple. Because I want to:

- Experiment with multiple virtual development machines, maybe running
  different versions
- Not pollute my workstation with server software
- Be able to turn all server software off, by shutting down the virtual
  machine, freeing up resources for, let's say, streaming HD content to my PS3.
- Share my virtual development machine image with colleagues (once it's
  perfect). So they all can have the joy of directly seeing the results of their
  code, without being able to cause any damage to existing systems.

Ok, let's get on with it!

## Install VirtualBox

In the process of writing this article, Sun has acquired VirtualBox and
improved the installation procedure. Hence, check out:
[dlc.sun.com/virtualbox/vboxdownload.html](https://dlc.sun.com/virtualbox/vboxdownload.html)

## Download Operating System for Virtual Machine

My production server uses Ubuntu (Server edition) as an OS. So my virtual
development server should be Ubuntu as well. I'm going for the Desktop edition
however. Further on, you'll learn that installing the desktop version has some
advantages in our case, and Apache & others won't care about the fact that we
have a different kernel and some extra GUI packages installed.

So download [Ubuntu Desktop](https://www.ubuntu.com/getubuntu/download) like this:

```bash
$ mkdir -p ~/ISO
$ curl https://ftp.snt.utwente.nl/pub/linux/ubuntu-releases/hardy/ubuntu-8.04-desktop-i386.iso -ko ~/ISO/ubuntu-8.04-desktop-i386.iso
```

Or however you see fit.

## Creating a New Virtual Machine

First of all, start VirtualBox OSE. It's under Applications->System Tools

Click New, Next

\->VM Name and OS Type

Come up with a name for your machine

Choose Linux 2.6

\->Base Memory Size

Choose 512MB (depending on your host's configuration)

\->Virtual Hard Disk

Click New..., Next

\->Virtual Disk Image Type

Choose Dynamically expanding image, Next

\->Virtual Disk Location and Size

Come up with a name for your disk

Choose 12GB (depending on your host's configuration)

Click Finish

\->Virtual Hard Disk

Click Next, Finish

## Configure the New Virtual Machine

Click on your machine, Settings

\->In General, Advanced

Enable VT-x/AMD-V if your host supports it.

\->In CD/DVD-ROM

Mount CD/DVD Drive

ISO Image File, Select

\->Virtual Disk Manager

Click Add

Select ~/ISO/ubuntu-8.04-desktop-i386.iso

\->In Shared Folders

Click Add, Select

Select ~/workspace/project/

Click OK

Start the new Virtual Machine

The Ubuntu installation will show up.

## Install Operating System

![booting.png](/assets/images/posts/booting.png "booting.png")

Install Ubuntu like you normally would; this is a next, next, next kind of
procedure.

Reboot when it's done, and log in

## Drivers on the New Virtual Machine

Click Devices, Install Guest Additions, Yes, Download, Mount

Open a terminal, type

```bash
$ sudo /media/cdrom/VBOXLinuxAdditions.run
$ sudo reboot
```

## Upgrade Operating System

First let's unlock all of Ubuntu's wonderful extra repositories.

Open a terminal, copy & paste:

```bash
$ sudo sed -i "/^# deb.*multiverse/ s/^# //" /etc/apt/sources.list
$ sudo sed -i "/^# deb.*universe/ s/^# //" /etc/apt/sources.list
```
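
Since `sed -i` edits the file in place, you may want to preview the change first. A minimal sketch, using the same expression as above but reading from stdin and writing to stdout; the function name `uncomment_component` is made up for this example:

```bash
# Hypothetical helper: uncomment "# deb ..." lines that mention the given
# repository component. Reads a sources.list on stdin, prints the result.
uncomment_component() {
  local component="$1"
  sed "/^# deb.*${component}/ s/^# //"
}
```

Running `uncomment_component universe < /etc/apt/sources.list | diff /etc/apt/sources.list -` shows what would change without touching the file. Note that the `universe` pattern also matches `multiverse` lines.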

And now let's upgrade, type:

```bash
$ sudo aptitude -y update \
 && sudo aptitude -y dist-upgrade \
 && sudo aptitude -y install subversion msttcorefonts \
 && sudo reboot
```

Wait a bit for Ubuntu to get up to speed. It's possible that Ubuntu finds a
new kernel. In such a case, you may have to reinstall the VirtualBox Drivers
(Devices->Install Guest Additions).

## Make It a Server

We should probably stick as close as possible to your production server's
configuration. I'm going to assume it's a LAMP server, so let's set that up
first.

Open a terminal, type

```bash
$ sudo aptitude -y install phpmyadmin mysql-server
$ sudo a2enmod rewrite
```

*phpmyadmin* has a lot of dependencies, so it automatically pulls in
apache2, php5, etc.

You probably want mod\_rewrite enabled as well.

Does it work?

Well let's open up a browser and find out. Go to <https://localhost>.

**Cool**, apache is serving up a page nicely. You can also check out
<https://localhost/phpmyadmin/>

To confirm that MySQL & PHP are working as well.

![phpmyadmin.png](/assets/images/posts/phpmyadmin.png "phpmyadmin.png")

But this is all very default. Not quite similar to your production server
environment yet.

## Make It Your Server

These links will help you copy data from your production server to your
virtualized server:

- [Restore packages using dselect-upgrade](/blog/2007/08/03/restore-packages-using-dselectupgrade/) (for your APT & PEAR packages)
- [Synchronize files with rsync](/blog/2007/08/16/synchronize-files-with-rsync/) (maybe copy config files or content)
- [Transfer all MySQL databases to another server](/blog/2007/11/22/tranfer-all-mysql-databases-to-another-server/) (maybe copy your
  database)

But know that there is no generic way to make it your server. Only you know
what config files are important for your server. So let me show you how I made
it my server, then.

## Make It My Server

### Going Root

Because it's me, I'm done sudoing. It's a virtual server and the risks are
minimal... I'm going root.

```bash
$ sudo passwd # (enter password 3 times)
$ su -
```

### Get Code

I need the project's code. But - and this is very important - I don't want a
separate checkout from SVN. I want the fresh code that's in my workspace
folder on my workstation. This way, I can just save my code, and my virtual
server can instantly show me the changes. Without having to commit, and
without harming anything other than my local working copy.

We have already shared our project's working copy in step "Configure the new
Virtual Machine". In my case it's stored on my workstation in
`~/workspace/project/`, and so the share is called 'project'.

To make it available in the 'production' directory of the virtual machine, I
type the following:

```bash
$ mkdir -p /var/www/project
$ mount -t vboxsf -o uid=33,gid=33 project /var/www/project
$ ls -al /var/www/project
```

![mount.png](/assets/images/posts/mount.png "mount.png")

Because /var/www/project is the same directory that the production server
uses, you shouldn't get any `'require_once failed to open stream: No such file or directory'` errors.
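
The `uid=33,gid=33` options map the share to the `www-data` user and group, which have ID 33 on a stock Debian or Ubuntu install. If you'd rather not hardcode the numbers, a small sketch like this can look them up (the function name `vboxsf_mount_options` is hypothetical):

```bash
# Hypothetical helper: build the uid/gid mount options for a given user,
# so the share is owned by whatever account your web server runs as.
vboxsf_mount_options() {
  local user="$1"
  printf 'uid=%s,gid=%s\n' "$(id -u "$user")" "$(id -g "$user")"
}
```

The mount command would then read `mount -t vboxsf -o "$(vboxsf_mount_options www-data)" project /var/www/project`.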

### Get Environment

Of course I need to tell Apache to serve the code in /var/www/project when it
sees a request for your.projects.domain.com. That's done with VHosts, and I'm
just taking the production server's VHost as an example (the closer the
resemblance, the better, right?).

```bash
$ scp my.production.server.com:/etc/apache2/sites-available/project /etc/apache2/sites-available/
```

```bash
$ a2ensite project
$ /etc/init.d/apache2 restart
```

#### Fixing Apache Errors

What does apache say? Now is the time to resolve any missing paths, config
files, or modules you may encounter. This is where you have to do some actual
work!

In my case, I needed to fix the following:

- Add 443 to `ports.conf`
- Enable SSL with `a2enmod ssl`
- Generate SSL certificates
- Change some references from `/etc/apache2/ssl/apache.pem` to
  `/etc/apache2/ssl` (had to do with switching from Debian to Ubuntu)
- Change all external IP references to 127.0.0.1 in my VHost
- Install memcache module, `aptitude install php5-memcache`
- Include the memcache module: add `extension=memcache.so` to `php.ini`
- Allow requests from Virtual Machine, by adding an `Allow from 127.0.0.1`
  directive to my `.htaccess`

OK that fixed all of my initial errors.

### Get Database?

![mysql.jpg](/assets/images/posts/mysql.jpg "mysql.jpg")

You could go two ways:

- Allow your outgoing IP access to your real production server and work
  with live data (easy, but insecure)
- Copy your production database to your virtual development server

In my case, I chose option 1: to let the Virtual instance connect to the real
production database, so I could test with live data, and didn't have to copy
the entire database to my instance.

Whether this is wise or not, depends on your situation & skill.

But if you want a real & separate test environment, you might also want a copy
of your database.

- Transfer your live data to your virtual machine's MySQL server
- Point the database hostname to 127.0.0.1, by including a line in your
  `/etc/hosts` file, like: `127.0.0.1 database.your.projects.domain.com`

Now, all connections from PHP made to *database.your.projects.domain.com*,
will end up pointed to your localhost, where a local MySQL instance is
running.

*The same goes for memcached servers*, and all other types of servers for that
matter.

## Testing Your Project

Now. We could set up bridged network interfaces (tap) that allow the virtual
machine to claim its own IP. This way you can use your workstation's browser
to surf to that IP and test your code. We'd have to change your workstation's
`/etc/hosts` file to have your.projects.domain.com point to the IP of the
virtual machine (let's say 10.0.0.2).

But setting up tap interfaces is a drag, and so is changing your hosts file
every time.

So why not test within the virtual machine? **It's a desktop anyway :)** And
you only have to change its hosts file once. You can test your workspace's
code within your virtual instance, and you keep your workstation clean for
showing the real thing. This means you don't have to modify your own
workstation except for installing VirtualBox. This could also open up
opportunities to share your virtual machine's image with colleagues. An entire
development & testing environment in 1 file. **That's quite awesome.**

### Change Hosts File

In order for everything to work properly, we need to point
your.projects.domain.com to 127.0.0.1. This way our local Apache is contacted,
even though we request your.projects.domain.com. And requesting the domain
name is a must, to ensure your internal code is treated in the same way, and
that Apache knows which VHost to serve.

```bash
$ $EDITOR /etc/hosts
```

And add your.projects.domain.com after 127.0.0.1 localhost, make the head look
like this:

```bash
127.0.0.1       localhost your.projects.domain.com
127.0.1.1       hulk
```
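
If you set up more than one virtual machine, editing the file by hand gets old. A minimal sketch of the same edit as a filter, assuming the hosts file has a `127.0.0.1` line as shown above; the function name `add_host_alias` is invented for this example:

```bash
# Hypothetical helper: append a hostname to the first 127.0.0.1 line of a
# hosts file read from stdin, unless it is already listed there.
add_host_alias() {
  local name="$1"
  awk -v name="$name" '
    $1 == "127.0.0.1" && !done {
      present = 0
      for (i = 2; i <= NF; i++) if ($i == name) present = 1
      # Append to the untouched line so the original spacing is kept
      if (!present) $0 = $0 " " name
      done = 1
    }
    { print }
  '
}
```

Write the result to a temporary file first (`add_host_alias your.projects.domain.com < /etc/hosts > /tmp/hosts.new`), review it, and only then copy it over `/etc/hosts`.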

### Browse Away!

So?! What are you waiting for? Fire up firefox and enter
<https://your.projects.domain.com>. What do you see?

## Fixing Project Errors

Well I saw some errors. Some includes could not be found.

### File Errors

When I commit to our production server, some files are copied to the document
root that are not in my working copy. This is done by a deploy script tied to
the post-commit hook.

To fix this I added a line to my virtual host:

```bash
php_admin_value include_path .:/usr/share/php:/usr/share/pear:/var/www/project/_common
```

So it would also look in the `_common` folder. Other solutions may be:

- Change some code in your project, making it aware that it's running in
  development mode when it's hosted on 127.0.0.1, thus pulling files from some
  other location.
- Make the dependencies available on your virtual machine
- Let rsync run in daemon mode and pull files on changes
- Creating a local deploy script (could be a lot faster than the remote one,
  but try to stay away from 'compile' steps, we're doing this for agility &
  speed!)

## Further Reading

### Make It Stick

When you turn your new virtual development machine off & on, you don't want to
have to reconfigure things. Almost everything is saved in config files so you
don't have to, except for the mount of the shared project directory. That was
just a command we executed.

But we can easily save it in a config file called `/etc/fstab`. So why
not open it up with your favourite text editor, and add the following line:

```bash
project    /var/www/project vboxsf uid=33,gid=33 0 0
```

This will make the mount persistent. If it doesn't, add a `mount -a` to your
`/etc/rc.local`
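
If you script this step, it helps to make it safe to run twice. A minimal sketch, with a made-up helper name `ensure_line`, that only appends the line when an identical one is not already in the file:

```bash
# Hypothetical helper: append a line to a file unless the exact same line
# is already present, so repeated runs do not create duplicates.
ensure_line() {
  local file="$1" line="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

As root you would call it as `ensure_line /etc/fstab 'project    /var/www/project vboxsf uid=33,gid=33 0 0'`.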

### Troubleshooting Screen Resolution

Since we're testing inside the virtual machine, a resolution higher than
800x600 would be nice. If this works out of the box: good for you. I had
problems setting it higher using System->Preferences->Screen. To fix it, I
changed my */etc/X11/xorg.conf* to:

```bash
# xorg.conf (X.Org X Window System server configuration file)
#
# This file was generated by dexconf, the Debian X Configuration tool, using
# values from the debconf database.
#
# Edit this file with caution, and see the xorg.conf manual page.
# (Type "man xorg.conf" at the shell prompt.)
#
# This file is automatically updated on xserver-xorg package upgrades *only*
# if it has not been modified since the last upgrade of the xserver-xorg
# package.
#
# If you have edited this file but would like it to be automatically updated
# again, run the following command:
#   sudo dpkg-reconfigure -phigh xserver-xorg

Section "InputDevice"
    Identifier      "Generic Keyboard"
    Driver          "kbd"
    Option          "XkbRules"      "xorg"
    Option          "XkbModel"      "pc105"
    Option          "XkbLayout"     "us"
EndSection

Section "InputDevice"
    Identifier      "Configured Mouse"
    Driver          "vboxmouse"
    Option          "CorePointer"
EndSection

Section "Device"
    Identifier      "Configured Video Card"
    Driver          "vboxvideo"
EndSection

Section "Monitor"
    Identifier      "Configured Monitor"
EndSection

Section "Screen"
    Identifier      "Default Screen"
    Device          "Configured Video Card"
    Monitor         "Configured Monitor"
    DefaultDepth    24
    SubSection "Display"
        Modes       "1152x768" "1024x768" "800x600"
    EndSubSection
EndSection

Section "ServerLayout"
    Identifier      "Default Layout"
    Screen          "Default Screen"
EndSection
```

### Other Virtual Machine Ware

I chose VirtualBox. If you want to know why, here's my very subjective
[comparison 'chart' on Virtual Machine software](/blog/2008/09/13/virtualization-compared/)

## Performance

More cores help ; )

## Improve This Article

This was quite a long post, so if you spot errors or know better ways, help me
improve this article by leaving a comment.
]]></content:encoded>
      <dc:date>2008-10-06T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>PEAR Coding Standards Changed!</title>
      <link>https://kvz.io/pear-coding-standards-changed.html</link>
      <description><![CDATA[In another article I've told you about how I would like to see one rule
removed from the PEAR Coding Standards. Removing it would allow developers
a bit more flexibility, while staying true to the convention.
]]></description>
      <pubDate>Wed, 24 Sep 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/pear-coding-standards-changed.html</guid>
      <content:encoded><![CDATA[In [another article](/blog/2008/08/22/pear-coding-standards-change/) I've told you about how I would like to see one rule
removed from the [PEAR Coding Standards](https://pear.php.net/manual/en/standards.php). Removing it would allow developers
a bit more flexibility, while staying true to the convention.

<!--more-->

At first the discussion was between the PEAR developers on the mailing list,
but as this is somewhat a matter of taste, opinions varied widely. The
PEAR group adopted the issue into their own meeting, silencing the discussion
on the mailing list.

And now it seems they have decided to indeed omit a specific declaration on
prefixing private or protected members with an underscore from the Coding
Standards.

**Effectively** leaving every PEAR developer the freedom to prefix their
private & protected members however they want (or not).

## More info

If you would like to know more about the background of this 'story', read on.

### Proposal

```bash
Hi Joshua,

First of all, thanks for your time.

On Fri, Aug 22, 2008 at 12:59 AM, Joshua Eichorn <josh at bluga dot net> wrote:
> If you can email me what you think it should look like that would be
> helpful.  But i will bring it up at the meeting (already added it to the
> agenda)

OK, I'll try to be as concise as possible, though my English may fail
me from time to time, and I should probably be sleeping already ; )

To clear one potential misunderstanding: My proposal is not about
forcing people to prefix every protected method with an underscore.
It's about allowing them to do so.

Private methods are becoming a minority. The need for them decreases
as classes often need to be fully extensible.

Though protected methods can indeed be redeclared as public (one could
even argue that protected methods explicitly meant/allowed to be
redeclared should not have an underscore, other should); these cases
are rare.  And in practice, protected methods are often replacing the
role that private methods used to have.

Some developers (like me) find it helpful to distinct 'inner methods'
and public interfacing methods visually, by prefixing them with an
underscore:

_load()
_stopBreathing()
_aim()
_fire()
_resumeBreathing()
shoot()

Distinguish functions that don't concern the outer world by
alphabetically grouping them. This comes in handy with the drop-down
lists many IDEs provide.

But faced with a movement where private methods are diminishing and
protected methods are taking over, this leaves us with two types of
methods: protected & public. PEAR coding standards do not allow us to
prefix protected methods with an underscore. Leaving us with:

load()
stopBreathing()
aim()
fire()
resumeBreathing()
shoot()

Which in some IDEs & minds is very OK. It's mostly an issue of
habit/taste, referring to your bike shedding project (red indeed is a
fine color).

Having thrown overboard lots of my old habits, incorporating PEAR's, I
definitely understand the need for convention. But since quite some
developers use this concept as a profound tool add order to chaos,
such a small thing could make for a big difference in coding
experience. It would at least for me at the moment.

So I'm pleading here for PEAR CS to allow developers to choose their
own protected method names (whether prefixed with underscores, or
not).

Hope this wasn't too long!

--
Met vriendelijke groet / Kind regards,

Kevin van Zonneveld
https://kevin.vanzonneveld.net
```

### Links

- [PEAR Group's meeting minutes](https://wiki.pear.php.net/index.php/MeetingMinutes20080824#Underscore-p)

A short note confirming their decision

- [Discussion on the mailing list](https://www.nabble.com/PEAR-Coding-Standards-question-)

What started it

## Finally

Well, I'd just like to say that I'm very happy to see that in PEAR - as an
organization - it's actually possible to make these things happen, even though
you only have [one PEAR contribution](https://pear.php.net/package/System-Daemon) to back you up.

So thank you PEAR, and thank you PEAR Group.
]]></content:encoded>
      <dc:date>2008-09-24T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Virtualization Compared</title>
      <link>https://kvz.io/virtualization-compared.html</link>
      <description><![CDATA[Recently I've been experimenting with Virtual machines for my development
environment. The goal was to create a Virtual Machine that resembles our main
production server, and have that Virtual Machine mount my workspace project
directory as its DocumentRoot. This way, my code could be served & tested
after every save in my IDE. So no more building / committing delays. And all I
could mess up was a Virtual Machine.
]]></description>
      <pubDate>Sat, 13 Sep 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/virtualization-compared.html</guid>
      <content:encoded><![CDATA[Recently I've been experimenting with Virtual machines for my development
environment. The goal was to create a Virtual Machine that resembles our main
production server, and have that Virtual Machine mount my workspace project
directory as its DocumentRoot. This way, my code could be served & tested
after every save in my IDE. So no more building / committing delays. And all I
could mess up was a Virtual Machine.

<!--more-->

I didn't know what software to start with and just tried the bunch. Here's my
**ever so subjective** comparison 'chart' on Virtual Machine software.

## Choices

#### VMWare

- Best user interface around
- Has better network support (allows guests their own IPs easily)
- Will eat your hard drives alive
- All-round slow. Running two instances will render your workstation useless.
- Costs money (workstation edition, which is what I needed)

#### KVM

- Is supposed to be quite awesome (especially beats VirtualBox Networking)
- Fairly new and still needs some work (especially on the interface)
- Could not find good documentation yet
- Only Windows & Linux guests supported

#### Qemu

- Fast
- Supports many operating systems
- Scares me
- Documentation was offline
- Development seems to have dropped. Last changelog was from January 2008
- A lot of 'hand work', could not find a good GUI

#### Xen

- Good for production use in server environments
- Poor workstation interface

#### VirtualBox

- Easy user interface
- Good performance
- Good integration with OS
- Open source 'VMWare'
- Available through package management like apt
- Poor network support (bridging may not be required, but it called for
  extensive tweaking of my workstation's network interfaces file)

## And the winner is...

VirtualBox! For my workstation environment, VirtualBox delivers the best
tradeoff between costs, performance and ease of use.

I will be looking forward to people sharing their experiences, and concerns
about the lack of scientific substantiation of this article ; )

## Keep in mind

That virtualization is a real hype now. I've been following VirtualBox for 3
months, and Sun Microsystems for instance is making solid progress with the
newly acquired VirtualBox. Other big players are investing serious resources as
well, and if KVM can get the docs & GUI right, they're bound to play a big
role as well, since it's integrated with the Linux kernel.

In short: my findings may be outdated shortly after posting. Ah well.. here it
goes anyway.
]]></content:encoded>
      <dc:date>2008-09-13T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>PHP Recursive str_replace: replaceTree</title>
      <link>https://kvz.io/php-recursive-str-replace-replacetree.html</link>
      <description><![CDATA[Working with trees
]]></description>
      <pubDate>Fri, 05 Sep 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/php-recursive-str-replace-replacetree.html</guid>
      <content:encoded><![CDATA[## Working with trees

When working with tree data structures you often need to craft them in
different ways. PHP offers a lot of functions to change the shape of arrays,
but often they only go 1 level deep. Trees can count an almost infinite number
of levels. Hence we need recursive replacements for our beloved array & string
functions.

<!--more-->

## replaceTree

replaceTree is the tree version of str\_replace. It will recursively replace
through an array of strings.

```php
<?php
/**
 * Recursive alternative to str_replace that supports replacing keys as well
 *
 * The following code block can be utilized by PEAR's Testing_DocTest
 * <code>
 * // Input //
 * $settings = array(
 *	 "Credits" => "@appname@ created by @author@",
 *	 "Description" => "@appname@ can parse logfiles and store them in mysql",
 *	 "@author@_mail" => "kevin@vanzonneveld.net"
 * );
 * $mapping = array(
 *	 "@author@" => "kevin",
 *	 "@appname@" => "logchopper"
 * );
 *
 * // Execute //
 * $settings = replaceTree(
 *	 array_keys($mapping), array_values($mapping), $settings, true
 * );
 *
 * // Show //
 * print_r($settings);
 *
 * // expects:
 * // Array
 * // (
 * //	 [Credits] => logchopper created by kevin
 * //	 [Description] => logchopper can parse logfiles and store them in mysql
 * //	 [kevin_mail] => kevin@vanzonneveld.net
 * // )
 * </code>
 *
 * @param mixed   $search   String or array of strings to look for
 * @param mixed   $replace  String or array of replacement strings
 * @param array   $array
 * @param boolean $keys_too
 *
 * @return array
 */
function replaceTree($search="", $replace="", $array=false, $keys_too=false)
{
	if (!is_array($array)) {
		// Regular replace
		return str_replace($search, $replace, $array);
	}

	$newArr = array();
	foreach ($array as $k=>$v) {
		// Replace keys as well?
		$add_key = $k;
		if ($keys_too) {
			$add_key = str_replace($search, $replace, $k);
		}

		// Recurse
		$newArr[$add_key] = replaceTree($search, $replace, $v, $keys_too);
	}
	return $newArr;
}
?>
```
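
The docblock example is only one level deep. As a rough sketch of how the recursion behaves on a deeper tree (the `$config` array is made up for illustration, and the function is repeated in compact form only so that the snippet runs on its own):

```php
<?php
// Compact copy of replaceTree() from above, guarded so it is not declared twice
if (!function_exists('replaceTree')) {
	function replaceTree($search = "", $replace = "", $array = false, $keys_too = false)
	{
		if (!is_array($array)) {
			return str_replace($search, $replace, $array);
		}
		$newArr = array();
		foreach ($array as $k => $v) {
			$add_key = $keys_too ? str_replace($search, $replace, $k) : $k;
			$newArr[$add_key] = replaceTree($search, $replace, $v, $keys_too);
		}
		return $newArr;
	}
}

// Hypothetical nested settings, three levels deep
$config = array(
	"@env@" => array(
		"title" => "@appname@ (@env@)",
		"paths" => array("/var/log/@env@", "/tmp/@env@"),
	),
);

$result = replaceTree(
	array("@env@", "@appname@"), array("staging", "logchopper"), $config, true
);

echo $result["staging"]["title"], "\n";    // logchopper (staging)
echo $result["staging"]["paths"][1], "\n"; // /tmp/staging
```

Because `$keys_too` is true, the top-level key `@env@` is rewritten as well, while the numeric keys of `paths` come out unchanged.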
]]></content:encoded>
      <dc:date>2008-09-05T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>PHP Recursive ksort: ksortTree</title>
      <link>https://kvz.io/php-recursive-ksort-ksorttree.html</link>
      <description><![CDATA[Working With Trees
]]></description>
      <pubDate>Fri, 05 Sep 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/php-recursive-ksort-ksorttree.html</guid>
      <content:encoded><![CDATA[## Working With Trees

When working with tree data structures you often need to craft them in
different ways. PHP offers a lot of functions to change the shape of arrays,
but often they only go 1 level deep. Trees can count an almost infinite number
of levels. Hence we need recursive replacements for our beloved array
functions.

<!--more-->

## ksortTree

ksortTree is the tree version of ksort. It will alphabetically reorder a tree
based on its keys.

```php
<?php
/**
 * Recursive alternative to ksort
 *
 * The following code block can be utilized by PEAR's Testing_DocTest
 * <code>
 * // Input //
 * $array = array(
 *	 "c" => array(
 *		 "d" => 4,
 *		 "a" => 1,
 *		 "b" => 2,
 *		 "c" => 3,
 *		 "e" => 5
 *	 ),
 *	 "a" => array(
 *		 "d" => 4,
 *		 "b" => 2,
 *		 "a" => 1,
 *		 "e" => 5,
 *		 "c" => 3
 *	 ),
 *	 "b" => array(
 *		 "d" => 4,
 *		 "b" => 2,
 *		 "c" => 3,
 *		 "a" => 1
 *	 )
 * );
 *
 * // Execute //
 * ksortTree($array);
 *
 * // Show //
 * print_r($array);
 *
 * // expects:
 * // Array
 * // (
 * //	 [a] => Array
 * //		 (
 * //			 [a] => 1
 * //			 [b] => 2
 * //			 [c] => 3
 * //			 [d] => 4
 * //			 [e] => 5
 * //		 )
 * //
 * //	 [b] => Array
 * //		 (
 * //			 [a] => 1
 * //			 [b] => 2
 * //			 [c] => 3
 * //			 [d] => 4
 * //		 )
 * //
 * //	 [c] => Array
 * //		 (
 * //			 [a] => 1
 * //			 [b] => 2
 * //			 [c] => 3
 * //			 [d] => 4
 * //			 [e] => 5
 * //		 )
 * //
 * // )
 * </code>
 *
 * @author	Kevin van Zonneveld <kevin@vanzonneveld.net>
 * @copyright 2008 Kevin van Zonneveld (https://kevin.vanzonneveld.net)
 * @license   https://www.opensource.org/licenses/bsd-license.php New BSD Licence
 * @version   SVN: Release: $Id: ksortTree.inc.php 223 2009-01-25 13:35:12Z kevin $
 * @link	  https://kevin.vanzonneveld.net/
 *
 * @param array &$array Tree to sort in place
 *
 * @return boolean false when $array is not an array, true otherwise
 */
function ksortTree( &$array )
{
	if (!is_array($array)) {
		return false;
	}

	ksort($array);
	foreach ($array as $k=>$v) {
		ksortTree($array[$k]);
	}
	return true;
}
?>
```
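
One detail that is easy to miss: the loop recurses into `$array[$k]` and not into `$v`, because `$v` is only a copy. A small sketch of the difference, using plain `ksort` and a made-up `$tree`:

```php
<?php
$tree = array("b" => array("z" => 1, "y" => 2), "a" => 0);
ksort($tree);

// Sorting the loop copy $v leaves the subtree inside $tree untouched
foreach ($tree as $k => $v) {
	if (is_array($v)) {
		ksort($v);
	}
}
echo implode(",", array_keys($tree["b"])), "\n"; // z,y

// Sorting through $tree[$k] changes the tree itself
foreach ($tree as $k => $v) {
	if (is_array($tree[$k])) {
		ksort($tree[$k]);
	}
}
echo implode(",", array_keys($tree["b"])), "\n"; // y,z
```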
]]></content:encoded>
      <dc:date>2008-09-05T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>PEAR Coding Standards Change?</title>
      <link>https://kvz.io/pear-coding-standards-change.html</link>
      <description><![CDATA[For a couple of months now, I've been involved with PEAR as a
contributor. Contributing to PEAR means adhering to the
PEAR Coding Standards. Their standards are well thought out, and using them
for projects (also outside of PEAR) leads to consistency and makes it easier
for many developers to understand each other's code.
]]></description>
      <pubDate>Fri, 22 Aug 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/pear-coding-standards-change.html</guid>
      <content:encoded><![CDATA[For a couple of months now, I've been involved with [PEAR](https://pear.php.net) as a
[contributor](https://pear.php.net/package/System-Daemon). Contributing to PEAR means adhering to the
[PEAR Coding Standards](https://pear.php.net/manual/en/standards.php). Their standards are well thought out, and using them
for projects (also outside of PEAR) leads to consistency and makes it easier
for many developers to understand each other's code.

<!--more-->

Code can be scanned and checked for conformity using the [PHP CodeSniffer](https://pear.php.net/package/PHP-CodeSniffer)
package.

It took me a while to get rid of my old coding habits & make the standards
second nature, but I'm very happy with the results now. Except for one
rule. It has bugged me to the point that I've decided to ask the [PEAR
Group](https://pear.php.net/group/) to reconsider it.

It's the rule that forbids developers from prefixing their protected methods
with an underscore.

I'll try to be as concise as possible, though my English may fail me from time
to time, and I should probably be sleeping already ; )

## Prefixing Methods

Some developers (like me) find it helpful to visually distinguish 'inner
methods' from public interfacing methods by prefixing them with an underscore:

```php
<?php
// Please just let the outside world: shoot,
// and let the class worry about things like breathing properly
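// (Shooter is just an illustrative class name)
class Shooter
{
	protected function _load() {}
	protected function _stopBreathing() {}
	protected function _aim() {}
	protected function _fire() {}
	protected function _resumeBreathing() {}
	public function shoot() {}
}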
?>
```

Alphabetically grouping functions that don't concern the outer world comes in
handy with the drop-down lists many IDEs provide. Especially with
large/unknown/forgotten projects, it can provide quick insight into the inner
workings of a class.

To clear one potential misunderstanding: My proposal is not about forcing
people to prefix every protected method with an underscore. It's about
allowing them to do so.

## Privates Are Dead. Long Live the Protected!

Private methods are becoming a minority. The need for them decreases as
classes often need to be fully extensible.

Faced with a movement where private methods are diminishing and protected
methods are taking over, this leaves us mainly with two types of methods:
**protected** & **public**.

PEAR coding standards do not allow us to prefix protected methods with an
underscore, resulting in:

```php
<?php
// Please just let the outside world: shoot,
// and let the class worry about things like breathing properly
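// (Shooter is just an illustrative class name)
class Shooter
{
	protected function load() {}
	protected function stopBreathing() {}
	protected function aim() {}
	protected function fire() {}
	protected function resumeBreathing() {}
	public function shoot() {}
}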
?>
```

Which in some IDEs & minds is perfectly OK. It's mostly an issue of habit/taste.
Having thrown overboard lots of my old habits and adopted PEAR's, I
definitely understand the need for convention. But since quite a few developers
use this concept as a tool to bring order to chaos, such a small thing
could make a big difference in coding experience. It would, at least for me,
at the moment.

## Can't protected methods be redeclared as public?

Yes, and this is a compelling argument against my case, because once you
redeclare a method as public, it is still prefixed with an underscore!

Take into consideration that private methods currently **must** be prefixed
with underscores. Following this logic through, one could argue that
protected methods explicitly meant/allowed to be redeclared as public mustn't
have an underscore, but all others should! (Let's make it clear: I am not
arguing this!)

But you see there's a need for flexibility when it comes to protected methods.

In reality, should you need a protected method to interface with the outside
world, you have to take orthogonality into account, and you're probably better
off wrapping the protected method inside a public method anyway.
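
A minimal sketch of that wrapping, assuming a made-up `Shooter` class with a made-up return value:

```php
<?php
// Shooter and the "bang" return value are hypothetical, purely for illustration
class Shooter
{
	// Inner method: keeps its underscore and stays protected
	protected function _fire()
	{
		return "bang";
	}

	// Public wrapper: the outside world gets a clean, unprefixed name
	public function fire()
	{
		return $this->_fire();
	}
}

$shooter = new Shooter();
echo $shooter->fire(), "\n"; // bang
```

Subclasses can still override `_fire()`, and no underscore-prefixed method ever has to be redeclared as public.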

Still, these cases that I speak of are rare. In 99% of the cases, you will
find that protected methods are just replacing the role that private methods
used to have.

## Concluding

So I'm pleading here for PEAR CS to allow developers to choose their own
private & protected method names (whether prefixed with underscores, or not).

The discussion can be viewed here:

[www.nabble.com/PEAR-Coding-Standards-question-tt19054118.html#a19054118](https://www.nabble.com/PEAR-Coding-Standards-question)
]]></content:encoded>
      <dc:date>2008-08-22T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Enhance PHP Session Management</title>
      <link>https://kvz.io/enhance-php-session-management.html</link>
      <description><![CDATA[In PHP, sessions can keep track of logged-in users. They are an
essential building block in today's websites with big communities and a lot of
user activity. Without sessions, everyone would be an anonymous visitor.
]]></description>
      <pubDate>Sun, 22 Jun 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/enhance-php-session-management.html</guid>
      <content:encoded><![CDATA[In PHP, sessions can keep track of logged-in users. They are an
essential building block in today's websites with big communities and a lot of
user activity. Without sessions, everyone would be an anonymous visitor.

<!--more-->

In system terms, PHP sessions are little files, stored on the server's disk.
But on high traffic sites, the disk I/O involved, and not being able to share
sessions between multiple webservers make this default system far from ideal.
This is how to enhance PHP session management in terms of performance and
shareability.

## Update

I've turned around on this subject and recommend reading [Revisiting Faster PHP Sessions](/blog/2011/04/29/faster-php-sessions/) instead.

## Session Sharing in Web Clusters

If you have multiple webservers all serving the same site, sessions should be
shared among those servers, and not reside on each server's individual disk.
Because once a user gets load-balanced to a different server, the session
cannot be found, effectively logging the user out.

A common way around this is to use custom session handlers: writing a class
that overrules default behavior and stores sessions in a MySQL database.

## Sessions in Database

All webservers connect to the same database and so, as soon as www01 registers
a session (insert in a sessions table), www02 can read it. All servers can now
see all sessions: problem solved?

Yes, and no. This sure is functional and tackles the shareability issue. But
databases seem to be the biggest bottlenecks of web clusters these days. They
are the hardest to scale, and so in high traffic environments you don't want
to (ab)use them for session management if you don't have to. We have to tackle
the 'performance' issue.

### Database Memory

Memory is about 30 times faster than disk storage. So storing our sessions in
memory somehow, could deliver great performance.

#### MySQL Query Caching

One form of using database memory is the standard MySQL query caching. But
MySQL query caching isn't very effective because it invalidates all cached
results related to a table if only one record in that table is changed.

Of course the session table is changed all the time, so the session cache is
purged all the time, rendering it quite useless for our purposes.

#### Heap Tables / Memory Tables

We're really closing in on our goal now. Storing the sessions in a heap/memory
table (a table that lives in your database server's RAM) speeds things up
greatly. Many demanding sites have opted for this solution.

In my eyes however, it's still not optimal. Because it still requires a lot of
additional queries that your database server(s) shouldn't necessarily have to
process.

One other possible solution is using Memcache. You will find it's easier
to set up and has a smaller footprint than most alternatives, for one thing
because you will not have to code custom session handler classes in PHP:
Memcache session support comes native.

## Memcache

Memcache is a little program originally written for LiveJournal. It's quite
straightforward: It reserves some memory, opens a socket and just stays
there.

We can connect to the socket, store variables in it, and retrieve
them later on. The storage of the variables is done in RAM. So it's lightning
fast ; )

Memcache is used for caching a lot of things: function results, entire HTML
blocks, database query results. But now we're going to use it to store our
site's user sessions.

### Architecture

From a system point of view, Memcache looks a lot like MySQL. You have a:

- **Server**: where information is stored. Should be running at all times.
- **Client module**: the interface to save & get information from the server.
  It's integrated in our programming language.

There is one important difference though. If the Memcache server is shut down,
the information in it is lost. So remember to use memcache as a cache only.
Don't store information in it, that can't be retrieved in some other way. For
sessions this is a risk I'm willing to take. Worst case scenario is that my
users will be logged out. If you cannot live with this, you could combine
database & memcache session handlers. Database will be the safe storage,
memcache will be in front of it for performance. If it crashes, you will only
lose performance, and not the data.
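
To make the combined approach a bit more concrete, here is a rough sketch of the idea. It is not a real session handler: `ArrayStore` and `LayeredSessionStore` are made-up names, and plain arrays stand in for both memcache and the sessions table so the snippet runs anywhere.

```php
<?php
// Array-backed stand-in for a storage backend (memcache or a database table)
class ArrayStore
{
	private $data = array();

	public function get($id)
	{
		return isset($this->data[$id]) ? $this->data[$id] : null;
	}

	public function set($id, $value)
	{
		$this->data[$id] = $value;
	}

	public function flush()
	{
		$this->data = array();
	}
}

// Reads hit the fast store first and fall back to the safe store
class LayeredSessionStore
{
	private $fast;
	private $safe;

	public function __construct($fast, $safe)
	{
		$this->fast = $fast;
		$this->safe = $safe;
	}

	public function write($id, $value)
	{
		$this->safe->set($id, $value); // durable copy first
		$this->fast->set($id, $value);
	}

	public function read($id)
	{
		$value = $this->fast->get($id);
		if ($value === null) {
			$value = $this->safe->get($id);
			if ($value !== null) {
				$this->fast->set($id, $value); // warm the cache again
			}
		}
		return $value;
	}
}

$memcache = new ArrayStore(); // pretend this is memcache
$database = new ArrayStore(); // pretend this is the sessions table
$sessions = new LayeredSessionStore($memcache, $database);

$sessions->write("abc123", "user_id=42");
$memcache->flush(); // simulate a memcache restart
echo $sessions->read("abc123"), "\n"; // still prints user_id=42
```

After the simulated restart the session is served from the safe store once and then sits in the fast store again, so only performance is lost, not the data.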

### Installing a Memcache Server

For session sharing, use a centralized server. If you only have one webserver,
it still makes sense to use Memcache from a performance point of view. Just
limit its maximum allowed memory size to 64MB (depending on your server &
wishes), and use the localhost (127.0.0.1) to connect to it.

If you don't have a Memcache server already, you can install it very easily
with package management. I use Ubuntu so in my case that would translate to:

```bash
$ aptitude install memcached
```

Adjust the settings in `/etc/memcached.conf`. In my case the
defaults were OK; I only increased the max allowed memory and allowed more
hosts to connect to it.

Now let's spawn the Memcache Daemon (yes that's what the 'd' stands for):

```bash
$ /etc/init.d/memcached start
```

Done, we're ready to use it... But how?

### Installing a Memcache 'client'

You could just open a socket and go talk to Memcache, but that would
eventually cause headaches. So there is a standard PHP module we can use that
does a lot of work for us, and allows us to talk to it in an object-oriented
way. This works much like installing a MySQL module. If it's in your distro's
package management, good for you. Let's:

```bash
$ aptitude install php5-memcache
```

If not, no problem. Make sure you have pecl available and:

```bash
$ pear install pecl/memcache
```

(I used 'pear' and not directly pecl to circumvent the [bug](https://aspn.activestate.com/ASPN/Mail/Message/pear-dev/3168978) that caused a
Fatal error: Allowed memory size of 8388608 bytes exhausted).

And please choose:

```bash
Enable memcache session handler support? [yes] : yes
```

You must enable PHP to use the `memcache.so` module we now have. So please add a

```bash
extension=memcache.so
```

to your `php.ini` (usually located at `/etc/php5/apache2/php.ini`)

Great, we have all the prerequisites to start Memcaching!

## Sessions in Memcache

PHP allows you to overrule the default session handler in two ways:

- [session\_set\_save\_handler()](https://nl3.php.net/manual/en/function.session-set-save-handler.php). By programming your own session
  handlers, allowing you to use virtually any type of storage, as long as you
  can read/write to it from PHP. [This example](https://nl3.php.net/manual/en/function.session-set-save-) uses a MySQL database. We
  could also use this method to connect to Memcache.
- [session.save\_handler](https://nl3.php.net/manual/en/session.configuration.php#ini.session). By specifying one of the default handlers in
  the php.ini file using the [session.save\_handler](https://nl3.php.net/manual/en/session.configuration.php#ini.session) & session.save\_path
  directives.

Option 1 allows greater flexibility. It even allows you to create a
combined database/memcache mechanism, resulting in a fallback-on-database in
case memcache goes offline and loses all of its sessions (effectively logging
out all users).

Option 2 is very easy to implement, doesn't require changing your existing
code, and is the one I'm going to show you today.

### session.save\_handler

Assuming that you have one webserver, and installed the Memcache server on that
same machine, the hostname will be 127.0.0.1. If you have it on a different
server, you will know what IP to substitute it with.

```bash
session.save_handler = memcache
session.save_path = "tcp://127.0.0.1:11211"
```

Done! Huh? What just happened?

Well, because we enabled Memcache session handler support, all work is done
for us. PHP will now know not to use the default `files` handler to save
session files in `/var/lib/php5`, but to use memcache running at 127.0.0.1
instead.

Don't forget to restart your webserver to activate your changes to the
`php.ini`:

```bash
$ /etc/init.d/apache2 restart
```

## The Catch

**Update** - As Manuel de Ruiter says in the comments, the following is no
longer true thanks to [some updates](https://pecl.php.net/package-changelog.php?package=memcache)

As with anything too cool, there is a catch: Locking. The standard PHP Session
module locks the whole session until the request finishes. Memcache is built
for speed and as a result, does not support this kind of locking. This could
lead to problems when using frames or Ajax. Some actions may request a
session variable before it's actually saved.

## Further Reading

What we have just done with Memcache is the low-hanging fruit. We've enabled
RAM sessions with minimum effort and without changing even one line of your
existing code.

But now that you have memcache running, you might want to use it for storing
other often-used, rarely-changed variables as well. Feel free to experiment
and review [the documentation](https://www.php.net/manual/en/book.memcache.php). You will learn that Memcache can enhance
your serverside performance dramatically.
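
As a starting point for such experiments, here is a hedged sketch of the usual "try the cache first, otherwise compute and store" pattern. The helper names `cacheClient()`, `remember()` and `slowCountry()` are made up; the snippet assumes the 127.0.0.1:11211 setup from above and quietly falls back to computing the value when the extension or the server is not available.

```php
<?php
// Returns a connected Memcache client, or null when the extension
// or the server is unavailable
function cacheClient()
{
	if (!class_exists('Memcache')) {
		return null;
	}
	$client = new Memcache();
	return @$client->connect('127.0.0.1', 11211) ? $client : null;
}

// Try the cache first, otherwise compute the value and store it for $ttl seconds
function remember($key, $ttl, $compute)
{
	$client = cacheClient();
	if ($client !== null) {
		$hit = $client->get($key);
		if ($hit !== false) {
			return $hit;
		}
	}
	$value = call_user_func($compute);
	if ($client !== null) {
		$client->set($key, $value, 0, $ttl);
	}
	return $value;
}

// Hypothetical stand-in for an expensive lookup
function slowCountry()
{
	return "Netherlands";
}

echo remember('visitor_country', 300, 'slowCountry'), "\n";
```

Note that a stored value of `false` cannot be told apart from a cache miss this way, which is fine for a sketch but worth keeping in mind.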
]]></content:encoded>
      <dc:date>2008-06-22T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>My New IDE: Eclipse PDT</title>
      <link>https://kvz.io/my-new-ide-eclipse-pdt.html</link>
      <description><![CDATA[I've been programming a lot with Quanta, which is a lightweight KDevelop-based
IDE. It did the trick for quite some time, but recent developments in my
coding life like SVN sent me on a quest for my new ultimate PHP
IDE.
]]></description>
      <pubDate>Fri, 11 Apr 2008 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/my-new-ide-eclipse-pdt.html</guid>
      <content:encoded><![CDATA[I've been programming a lot with Quanta, which is a lightweight KDevelop-based
[IDE](/categories/ide/). It did the trick for quite some time, but recent developments in my
coding life like [SVN](/categories/svn/) sent me on a quest for my new ultimate [PHP](/categories/php/)
IDE.

<!--more-->

## A New Editor

Obviously a new editor takes some getting used to. You've got to get to know
each other. You need to be open-minded about it, not fear the change, and not
get discouraged if you aren't that productive in this first phase.

## Eclipse PDT

I tried a lot of different editors. Most lightweight ones I tried had features
similar to Quanta. I needed something more serious. I heard TextMate was
awesome, but buying a Mac... There had to be an easier way ; )

And so on my quest, I stumbled upon [Eclipse](/categories/eclipse/) PDT (**P**HP **D**evelopment
**T**ools). Eclipse has been around for a long time and is used mostly by Java
developers.

But now there is the PDT flavor. It's a plugin (they call it a 'perspective')
that focuses the raw power of Eclipse on PHP, reducing a lot of clutter
in the interface and bringing a lot of dedicated PHP features.

## What I Liked at First Sight

Though Eclipse and I had a rough start, I did see a glimpse of its strong
points. The fact that it was:

- Open source
- Actively maintained by a large community
- Being used by a lot of professional Java developers
- Cross platform (nicely integrates with Ubuntu in my case)

... made sure that I was willing to give this IDE a serious shot.

## What I Like

Now, after a month of coding I can say I've indeed increased my productivity a
lot, and I'm really starting to like this IDE for making my life easier. Every
day. Why?

Eclipse supports:

- All platforms (Windows, Linux, Mac)
- Great defaults
  - Yet properties are extensively customizable and can be saved per project
- SVN Integration
- Trac Integration (!)
- PHPDoc comment blocks
- Jump to function declarations
- Moving around entire blocks of code with only ALT + cursors
- Customizable templates with intelligent markers for variables
- Advanced code completion & indentation
- Integrated PHP manual (begins with tooltip, F2 for extended info)
- Intelligent expanding / collapsing of code
- Easy creation of plugins, thus:
  - Loads of freely available plugins ranging from code management to syntax
    highlighting of exotic languages

In short, I make fewer hand movements, but produce more & better code ; )

## What I Dislike

### Setting It Up Can Be a Pain

To be specific, the installation itself was easy.

[Download the PDT All-In-One package](https://download.eclipse.org/tools/pdt/downloads/release.php?release=S2) and extract it.

There, you've successfully installed Eclipse.

But then if you want Bash & SVN support for example, things tend to explode in
your face and leave you heavily mutilated behind your keyboard.

### The Feature Updates

The idea of their integrated package management system for updating &
installing new components is great.

It allows you to add extra (third party) mirrors and then install & update
(third party) components automatically.

But you really have to know the ins & outs if you don't want to stumble on a
load of errors, mirrors that are down, strange dependency resolving issues,
etc. These are errors that might pop up after half an hour of downloading, and
will make you start all over again.

Knowing:

- Exactly what mirrors to add
- What packages to click on
- How to select their dependencies
- Which automatically selected dependencies to explicitly deselect (?!)

... will make all the difference. But you won't know until you've tried. If
people show an interest I might try to write my experiences down in another
article one day.

### Eclipse PDT Eats RAM & CPU for Breakfast.

Don't try running Eclipse on anything less than a recent Intel Core 2 Duo with
2GB of RAM. Trust me. I did.

Especially when working on large projects and enabling lots of components
like: SVN, HTML Tidy, syntax validation, etc. you need a fast workstation or
all of the delays will frustrate you.

## Conclusion

After installing it 5 times (also on different systems in my case) you get to
know the ins & outs of the **Feature Updates** system. It's a bit of a
minefield.

But hey, the hours you spend on tweaking your IDE to sheer perfection, pay off
the moment you start coding with it.

Here in the Netherlands we have a saying: Good tools cut work in half. Freely
translated, that is. And that's exactly what Eclipse PDT has done for me.

]]></content:encoded>
      <dc:date>2008-04-11T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Better Performance With mod_deflate</title>
      <link>https://kvz.io/better-performance-with-mod-deflate.html</link>
      <description><![CDATA[I used to use Dean Edwards' Javascript Packer a lot to compress my Javascript
sources. Libraries of 100kB could easily shrink to 30kB and that saves load
times & bandwidth. A good writeup by Julien Lecomte made me realize that
there were better ways.
]]></description>
      <pubDate>Sat, 29 Mar 2008 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/better-performance-with-mod-deflate.html</guid>
      <content:encoded><![CDATA[I used to use Dean Edwards' Javascript Packer a lot to compress my Javascript
sources. Libraries of 100kB could easily shrink to 30kB and that saves load
times & bandwidth. A good [writeup by Julien Lecomte](https://www.julienlecomte.net/blog/2007/08/13/) made me realize that
there were better ways.

<!--more-->

## Javascript Compressors

[Packer](https://dean.edwards.name/packer/) is an awesome utility that actually compresses code just like e.g.
rar would, only it then lets [Javascript](/categories/javascript/) unpack the compressed string and
execute it with eval. The file sizes of packer were always tinier than those of other
utilities like jsmin, because [jsmin](https://javascript.crockford.com/jsmin.html) just strips whitespace, etc. The
advantage of jsmin was of course that it respects your original code
better. And there's no time wasted in decompressing: jsmin results can be
interpreted directly.

In short, the article [Gzip Your Minified JavaScript Files](https://www.julienlecomte.net/blog/2007/08/13/) shifts the
balance of advantages & disadvantages between the compressors. Gzipped jsmin
files are just as big as gzipped packer files. But because packer files have to
be decompressed by Javascript, the packer file in the end is slower than the
jsmin file, which can be executed right away.

Okay, so letting your webserver & browser handle compressing & decompressing
is faster than letting Javascript do it. It not only saves you bandwidth,
it also makes the user experience snappier. And jsmin has better performance
than packer. Sounds reasonable, right?
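
To get a feel for what gzip does to Javascript source, here is a small sketch that compresses a made-up sample string with PHP's zlib functions. It only mimics what the webserver does on the wire and says nothing about packer or jsmin specifically; the sample function and the repeat count are assumptions chosen purely for illustration.

```php
<?php
// Made-up sample: a small function repeated to mimic a larger script
$js = str_repeat("function add(a,b){return a+b;}", 200);

if (function_exists('gzencode')) {
	// Level 9 is the strongest (and slowest) gzip compression level
	$gz = gzencode($js, 9);
	printf("plain: %d bytes, gzipped: %d bytes\n", strlen($js), strlen($gz));
} else {
	echo "zlib extension not available\n";
}
```

Real libraries are far less repetitive than this sample, so expect a much less dramatic ratio in practice.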

So first step is to jsmin your original Javascript files. [That's easy.](https://javascript.crockford.com/jsmin.html)

Second step is to enable automatic compression. Well that's pretty easy too.

## Enable mod\_deflate

I'm using [Apache](/categories/apache/) on Linux. Usually the Apache module mod\_deflate will be
enabled by default. If not, you have to enable it like this:

```bash
$ a2enmod deflate
$ /etc/init.d/apache2 force-reload
```

## Instructing Apache What to Compress

Next we need to tell the webserver what file types need to be deflated. This
can be done by either:

- creating an `.htaccess` file in your webroot, OR:
- modifying your VHost

Some people argue that configuration in the VHost is better because it saves
your server the [disk IO](/categories/io/) of accessing the [.htaccess](/categories/htaccess/) file with every
request. But for VHost configuration you will need to be admin of your server,
and an .htaccess file is as easy as uploading one with FTP. You choose.

Enter the following lines in one of the above files. And why not
compress/deflate HTML, XML & CSS while we're at it?

```bash
# Compress Output
AddOutputFilterByType DEFLATE text/html text/css text/plain text/xml application/x-javascript
```

```bash
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
```

The browser specific exceptions are necessary for compatibility.

Save the file, optionally (in the case of vhost) reload the apache config, and
your files are now compressed on the fly! Do you notice the difference?
]]></content:encoded>
      <dc:date>2008-03-29T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Determine SID of Windows User</title>
      <link>https://kvz.io/determine-sid-of-windows-user.html</link>
      <description><![CDATA[Sometimes when digging real deep into Windows like I recently had to, you need to have the Windows SID (Security Identifier) of a local user. I wasn't able to find any standard way of obtaining this info, so I wrote this little VBScript. Might help some people, might not. Putting this online anyway ; )
]]></description>
      <pubDate>Wed, 26 Mar 2008 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/determine-sid-of-windows-user.html</guid>
      <content:encoded><![CDATA[Sometimes when digging real deep into Windows [like I recently had to](/blog/2008/03/26/allow-windows-users-to-restart-service/), you need to have the [Windows](/categories/windows/) [SID](/categories/sid/) (Security Identifier) of a local user. I wasn't able to find any standard way of obtaining this info, so I wrote this little [VBScript](/categories/vbscript/). Might help some people, might not. Putting this online anyway ; )

<!--more-->

Open notepad and paste the following script:

```vbnet
strComputer = "<COMPUTERNAME>"
strUser = "<USERNAME>"
Set objWMIService = GetObject("winmgmts:\\" & strComputer & "\root\cimv2")
Set objAccount = objWMIService.Get ("Win32_UserAccount.Name='" & strUser & "',Domain='" & strComputer & "'")
Wscript.Echo objAccount.SID
```

Obviously,

- Change the `<COMPUTERNAME>` and `<USERNAME>` placeholders
- Save with .vbs extension (Like getsid.vbs)
- Execute it
]]></content:encoded>
      <dc:date>2008-03-26T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Allow Windows Users to Restart Service</title>
      <link>https://kvz.io/allow-windows-users-to-restart-service.html</link>
      <description><![CDATA[Let's say you want your local restricted users to be able to restart specific
services. On Linux you'd probably type visudo. In Windows, I found, you
have to dig a little deeper into the system and really do your research. I
needed several sites, programs and articles. So I thought it might be useful
to others if I'd bundle all the required information in one place. Here it is.
]]></description>
      <pubDate>Wed, 26 Mar 2008 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/allow-windows-users-to-restart-service.html</guid>
      <content:encoded><![CDATA[Let's say you want your local restricted users to be able to restart specific
services. On Linux you'd probably type visudo. In [Windows](/categories/windows/), I found, you
have to dig a little deeper into the system and really do your research. I
needed several sites, programs and articles. So I thought it might be useful
to others if I'd bundle all the required information in one place. Here it is.

<!--more-->

### Warning

This was tested on a Windows 2003 Server STD. It may not work on other
versions. Also, this is serious stuff. You can seriously mess up your system
using these pointers. Study before implementing anything. I warned you.

## Prerequisite: Resource Kit Tools

In this article we're going to change SDDL properties of certain objects. We
can do this with a tool called: sc.exe. It's distributed with the Windows
Server 2003 Resource Kit Tools.

So first we need to:

- [Download](https://www.microsoft.com/downloads/details.aspx?FamilyID=9D467A69) the Windows Server 2003 Resource Kit Tools
- Install it
- Open a command prompt (`cmd.exe`)
- Change to installation directory (`cd "C:\Program Files\Windows Resource Kits\Tools"`)

## Prerequisite: Access to SC Manager

Your users need to be able to access the Service Control Manager (scmanager)
as a prerequisite. If you want your changes to be user specific, you might
first want to [determine the
SID of a user](/blog/2008/03/26/determine-sid-of-windows-user/). This might return:

`S-1-5-21-151122097-1987018581-353216475-1003`

We can optionally use this SID later on.

### Lookup Current Scmanager SDDL

The security descriptor definition language ([SDDL](/categories/sddl/)) defines who is allowed
to do what. If we are going to change that (in this case for the scmanager),
we first want to see what the original SDDL is. So in the Resource Kit Tools
directory execute:

```bash
$ sc sdshow scmanager
```

And that might return:

```bash
D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)(A;;CCLCRPRC;;;SU)
(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)
S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)
```

(without the linebreaks)

### Change Scmanager SDDL

Now based on the original SDDL of scmanager, we're going to create a new one
that includes our user ([determine the SID of a user](/blog/2008/03/26/determine-sid-of-windows-user/)) by following these
rules:

- Copy the **I**nteractive **U**ser ACE string `(A;;CCLCRPRC;;;IU)`
- Change the `IU` to the SID of the user or group that you wish to grant access: `(A;;CCLCRPRC;;;S-1-5-21-151122097-1987018581-353216475-1003)`
- Insert the new ACE string before the `S:` part, like so:

```bash
D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)(A;;CCLCRPRC;;;SU)
(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)
(A;;CCLCRPRC;;;S-1-5-21-151122097-1987018581-353216475-1003)
S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)
```

(without the linebreaks)
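
If you'd rather not splice the string by hand, the same three rules can be sketched in a few lines of shell. This is only an illustration: it runs on any machine with a POSIX shell (the `sc` commands themselves still run on the Windows box), the variable names are mine, and the SDDL and SID are just the example values from this article.

```bash
#!/usr/bin/env bash
# Sketch: build a new scmanager SDDL by inserting one ACE before the "S:" part.
# orig and sid are the example values from this article, not something to copy blindly.
orig='D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)(A;;CCLCRPRC;;;SU)(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)'
sid='S-1-5-21-151122097-1987018581-353216475-1003'

# A copy of the Interactive User ACE, with IU swapped for the SID
ace="(A;;CCLCRPRC;;;${sid})"

# Split at the first "S:(" so that a SID ("S-1-5-...") is never mistaken for it
dacl="${orig%%"S:("*}"    # everything before the "S:" part
sacl="${orig#"$dacl"}"    # the "S:" part itself

new="${dacl}${ace}${sacl}"
echo "$new"
```

The echoed string is what you would then pass, in quotes, to `sc sdset scmanager`.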

### Set New Scmanager SDDL

In the Resource Kit Tools directory execute:

```bash
$ sc sdset scmanager "D:(A;;CC;;;AU)(A;;CCLCRPRC;;;IU)
(A;;CCLCRPRC;;;SU)(A;;CCLCRPWPRC;;;SY)(A;;KA;;;BA)
(A;;CCLCRPRC;;;S-1-5-21-151122097-1987018581-353216475-1003)
S:(AU;FA;KA;;;WD)(AU;OIIOFA;GA;;;WD)"
```

(without the linebreaks)

Your user now has remote access to the scmanager.

## Access to Your Service

Now we must grant users the right to start and stop your service. Let's take
Tomcat for example.

### Lookup Key Name

First we must lookup the internal service key. This is not always what is
displayed in the user interface. To find this key, in the Resource Kit Tools
directory execute:

```bash
$ sc GetKeyName "Apache Tomcat"
```

And that might return: `Tomcat5`

### Allow All Authenticated Users to Restart Service

We've already seen how to isolate a specific user. In the next example let's
allow all Authenticated Users (`AU`: every account that has logged on, which is
narrower than Everyone / World, `WD`) to start, stop &
query. In the Resource Kit Tools directory execute:

```bash
$ sc sdset Tomcat5 "D:AR(A;;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;BA)
(A;;LCRPWP;;;AU)(A;;CCLCSWLOCRRC;;;IU)(A;;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;SY)
S:(AU;FA;CCDCLCSWRPWPDTLOCRSDRCWDWO;;;WD)"
```

(without the linebreaks)
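
The part that does the work here is `(A;;LCRPWP;;;AU)`. The rights in an ACE are packed as two-letter codes, and a small shell sketch can unpack them. The meanings below are my reading of Microsoft's service access rights documentation, and `describe_right` and `decode_rights` are names I made up for this illustration, so double check before relying on it.

```bash
#!/usr/bin/env bash
# Sketch: decode the two-letter rights of a service ACE such as (A;;LCRPWP;;;AU).
# The mapping is an assumption based on Microsoft's service access rights docs.
describe_right() {
  case "$1" in
    CC) echo "query config" ;;
    DC) echo "change config" ;;
    LC) echo "query status" ;;
    SW) echo "enumerate dependents" ;;
    RP) echo "start" ;;
    WP) echo "stop" ;;
    DT) echo "pause/continue" ;;
    LO) echo "interrogate" ;;
    CR) echo "user-defined control" ;;
    SD) echo "delete" ;;
    RC) echo "read security descriptor" ;;
    WD) echo "change permissions" ;;
    WO) echo "change owner" ;;
    *)  echo "unknown ($1)" ;;
  esac
}

decode_rights() {
  rights="$1"
  out=""
  while [ -n "$rights" ]; do
    rest="${rights#??}"          # everything after the first two letters
    if [ "$rest" = "$rights" ]; then
      break                      # fewer than two letters left: stop
    fi
    code="${rights%"$rest"}"     # the first two letters
    out="${out:+$out, }$(describe_right "$code")"
    rights="$rest"
  done
  echo "$out"
}

decode_rights "LCRPWP"
```

For `LCRPWP` that comes down to query status, start and stop, which is exactly what a restricted user needs to restart the service.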

Voila! Your users have permission to start and stop the service, even though
they are just restricted users. Why not test it by logging in as a restricted
user and restarting your service?

## More Options

If my examples do not cut it for you, then you'll have to familiarize yourself
with the Security Descriptor Definition Language (SDDL). Here are some useful
sources to get you going.

### Sources

- [Introduction to SDDL](https://blogs.dirteam.com/blogs/jorge/archive/2008/03/26/parsing-sddl-)
- [SDDL syntax](https://www.washington.edu/computing/support/windows/UWdomains/SDDL.html)
- [SDDL Parse](https://blogs.microsoft.co.il/files/folders/guyt/entry70399.aspx)

As always, if I overlooked something, you know better ways or find errors,
please let me know!
]]></content:encoded>
      <dc:date>2008-03-26T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Class 'Imagick' Not Found</title>
      <link>https://kvz.io/class-imagick-not-found.html</link>
      <description><![CDATA[I tried to do some Image Magick with PHP recently on an Ubuntu
Feisty machine, and even though I had the required package: 'php5-imagick'
installed, and I updated my php.ini with imagick.so, I kept getting the
error Class 'Imagick' not found. This is how I eventually fixed it.
]]></description>
      <pubDate>Wed, 27 Feb 2008 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/class-imagick-not-found.html</guid>
      <content:encoded><![CDATA[I tried to do some [Image Magick](/categories/imagemagick/) with [PHP](/categories/php/) recently on an Ubuntu
Feisty machine, and even though I had the required package: 'php5-imagick'
installed, and I updated my `php.ini` with `imagick.so`, I kept getting the
error `Class 'Imagick' not found`. This is how I eventually fixed it.

<!--more-->

The idea is to fall back to the good old [PECL](/categories/pecl/) installation:

```bash
sudo aptitude install make php5-dev php-pear
sudo aptitude remove php5-imagick
sudo aptitude install libmagick9-dev
sudo pecl install imagick
sudo /etc/init.d/apache2 restart
```

And don't forget to load the [imagick](/categories/imagick/) extension by adding `extension=imagick.so` to your `php.ini`!

Well, that did it for me. No need for a long article this time, but this may
help others experiencing similar problems.
]]></content:encoded>
      <dc:date>2008-02-27T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>PHP: tiff2pdf</title>
      <link>https://kvz.io/php-tiff2pdf.html</link>
      <description><![CDATA[Or: How to convert multipage TIFF to PDF in PHP.
]]></description>
      <pubDate>Wed, 28 Nov 2007 00:00:00 +0100</pubDate>
      <guid>https://kvz.io/php-tiff2pdf.html</guid>
      <content:encoded><![CDATA[Or: **How to convert multipage TIFF to PDF in PHP.**

<!--more-->

Let's say you have a fax with multiple pages that has been stored as a
[TIFF](/categories/tiff/) and you want to convert it to [PDF](/categories/pdf/) using [PHP](/categories/php/) for digital
document flow. In this article I will show you a `tiff2pdf` function for PHP,
because it cannot be done directly with [ImageMagick](/categories/imagemagick/).

## Requirements

- php5 (I use *php5-cli* for running php from the command line)
- Imagick (native PHP extension for ImageMagick available through [PECL](/categories/pecl/))
- ps2pdfwr (gs-common)

In [Ubuntu](/categories/ubuntu/), this would translate to:

```bash
sudo aptitude update \
 && sudo aptitude install make php5-cli php5-gd php5-dev php-pear gs-common ghostscript \
 && sudo aptitude remove php5-imagick \
 && sudo apt-get install libmagick9-dev \
 && sudo pecl install imagick
```

Why use PECL to install [Imagick](/categories/imagick/) and not apt you say? Because currently,
Imagick from Ubuntu Gutsy repositories contains a [nasty bug](https://ubuntuforums.org/showthread.php?t=573878).

## Function

You can just copy & paste this and check out the example below, or read the
comments if you want to understand how it works.

```php
<?php
function tiff2pdf($file_tif, $file_pdf){
    // Initialize
    $errors     = array();
    $cmd_ps2pdf = "/usr/bin/ps2pdfwr";

    // Initial Error handling
    // (check the raw paths; they are only shell-escaped right before exec() below)
    if (!file_exists($file_tif)) $errors[] = "Original TIFF file: ".$file_tif." does not exist";
    if (!file_exists($cmd_ps2pdf)) $errors[] = "Ghostscript PostScript to PDF converter not found at: ".$cmd_ps2pdf;
    if (!extension_loaded("imagick")) $errors[] = "Imagick extension not installed or not loaded";
    // to include the imagick extension dynamically use an optional:
    // dl('imagick.so');

    // Only continue if there aren't any errors
    if (!count($errors)) {
        // Determine the file base
        $base = $file_pdf;
        if(($ext = strrchr($file_pdf, '.')) !== false) $base = substr($file_pdf, 0, -strlen($ext));

        // Determine the temporary .ps filepath
        $file_ps = $base.".ps";

        // Open the original .tiff
        $document = new Imagick($file_tif);

        // Use Imagick to write multiple pages to 1 .ps file
        if (!$document->writeImages($file_ps, true)) {
            $errors[] = "Unable to use Imagick to write multiple pages to 1 .ps file: ".$file_ps;
        } else {
            $document->clear();
            // Use ghostscript to convert .ps -> .pdf (escape the paths for the shell here)
            exec($cmd_ps2pdf." -sPAPERSIZE=a4 ".escapeshellarg($file_ps)." ".escapeshellarg($file_pdf), $o, $r);

            if ($r) {
                $errors[] = "Unable to use ghostscript to convert .ps(".$file_ps.") -> .pdf(".$file_pdf."). Check rights. ";
            }
        }
    }

    // return array with errors, or true with success.
    if (!count($errors)) {
        return true;
    } else {
        return $errors;
    }
}
?>
```

## Example

This is how you could call the function:

```php
<?php
// converts /dir/fax.tif to /dir/fax.pdf
if (($return = tiff2pdf("/dir/fax.tif", "/dir/fax.pdf")) !== true) {
    // error
    echo "Error:\n";
    print_r($return);
} else {
    // success
    echo "success!\n";
}
?>
```

## Read on for More Background Info

People are usually rushing for a quick solution so that's why I split up my
article and will put all the background information here. So **for the curious:**

## Approach

Every time I've directly tried to convert any format to PDF with only
ImageMagick, it has brought me nothing more than distorted files.

There's little documentation about doing this in PHP but the key in my
approach is in using 2 steps.

- `tiff2ps`: Convert TIFF to PostScript using Imagick
- `ps2pdf`: Convert PostScript to PDF using Ghostscript

## Imagick (The tiff2ps Step)

Imagick is a native PHP extension to create and modify images using the
ImageMagick API. It's [twice as fast](https://valokuva.org/?p=40) as making system calls to ImageMagick
commands and in this case I am using Imagick to create the in-between `.ps`
(PostScript) file.

### About Imagick's Syntax Change

Imagick recently changed quite a bit. I used to simply call:

```php
<?php
$image = imagick_readimage("/dir/file1");
imagick_writeimage($image, "/dir/file2");
?>
```

But nowadays, Imagick has become object oriented and the correct syntax is:

```php
<?php
$image = new Imagick("/dir/file1");
$image->writeImage("/dir/file2");
?>
```

Though I greatly approve of this change as it offers great flexibility:

```php
<?php
// Make a thumbnail of all JPG files in a directory
$images = new Imagick(glob('images/*.jpg'));
foreach($images as $image) {
    // Providing 0 forces thumbnailImage to maintain aspect ratio
    $image->thumbnailImage(1024,0);
}
$images->writeImages();

// from: https://nl3.php.net/imagick
?>
```

.. it does force you to recode your existing scripts.

### About Handling Multipage Documents

As you probably know, one TIFF is capable of having multiple pages. Load a
multipage TIFF and Imagick stores every page separately. So the code above,
which thumbnails multiple files in one directory using glob, could
just as well be used for looping over the pages in our document, like so:

```php
<?php
// Saving every page of a TIFF separately as a JPG thumbnail
$images = new Imagick("/dir/file1.tif");
foreach($images as $i=>$image) {
    // Providing 0 forces thumbnailImage to maintain aspect ratio
    $image->thumbnailImage(1024,0);
    $image->writeImage("/dir/file1_page".$i.".jpg");
}

$images->clear();
?>
```

This also explains a problem I ran into. When I tried to store the TIFF to PS,
I first just basically used:

```php
<?php
$image = new Imagick("/dir/file1.tif");
$image->writeImage("/dir/file2.ps");
?>
```

The above resulted in only the first page of my TIFF being saved in the
PostScript file.

The problem was solved by simply using `writeImages` like this:

```php
<?php
$image = new Imagick("/dir/file1.tif");
$image->writeImages("/dir/file2.ps", true); // true = adjoin all pages into one file
?>
```

One letter can make a big difference.

## Ghostscript (The ps2pdf Step)

Ghostscript is a suite of software based on an interpreter for Adobe Systems' PostScript and Portable Document Format (PDF) page description languages.

Unfortunately to convert the `.ps` to `.pdf` we still have to make one system
call to `ps2pdfwr`, which is a Ghostscript command included in the *gs-common*
package.
]]></content:encoded>
      <dc:date>2007-11-28T00:00:00+01:00</dc:date>
    </item>
    <item>
      <title>Disable Snapping Windows in Compiz-Fusion</title>
      <link>https://kvz.io/disable-snapping-windows-in-compizfusion.html</link>
      <description><![CDATA[Running compiz-fusion for some time, one thing started to annoy me. Snapping
windows. The first thing I obviously looked for was the Snapping Windows
Plugin. But that was already disabled.
]]></description>
      <pubDate>Thu, 18 Oct 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/disable-snapping-windows-in-compizfusion.html</guid>
      <content:encoded><![CDATA[Running compiz-fusion for some time, one thing started to annoy me. Snapping
windows. The first thing I obviously looked for was the Snapping Windows
Plugin. But that was already disabled.

<!--more-->

I'm blogging the setting that controls this behavior because it took me some
time to find it, and I think other people may find it counterproductive as well.

No need for a long article this time.

It's a subsetting of the Wobbly Windows plugin called: Snap Inverted. Turn it
off and the snapping stops.
]]></content:encoded>
      <dc:date>2007-10-18T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Convert Anything to Tree Structures in PHP</title>
      <link>https://kvz.io/convert-anything-to-tree-structures-in-php.html</link>
      <description><![CDATA[I recently faced a programming challenge that almost broke my brain. I
needed to create a function that could explode any single-dimensional
array into a full blown tree structure, based on the delimiters
found in its keys. The tricky part was that the depth of the tree could be unlimited. I
called the function: explodeTree. And maybe it's best to first look at an
example.
]]></description>
      <pubDate>Wed, 03 Oct 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/convert-anything-to-tree-structures-in-php.html</guid>
      <content:encoded><![CDATA[I recently faced a [programming](/categories/programming/) challenge that almost broke my brain. I
needed to create a function that could explode any single-dimensional
[array](/categories/array/) into a full blown [tree](/categories/tree/) structure, based on the [delimiters](/categories/delimiter/)
found in its keys. The tricky part was that the depth of the tree could be unlimited. I
called the function: `explodeTree`. And maybe it's best to first look at an
example.

<!--more-->

## The Directory Example

Here I will give an example of what the `explodeTree` function could be used for.
Let's say we need a recursive directory listing of */etc/php5*, and for that
we execute:

```php
<?php
if(exec("find /etc/php5", $files)){
    // the $files array now holds the paths as its values,
    // but we also want the paths as keys:
    $key_files = array_combine(array_values($files), array_values($files));

    // show the array
    print_r($key_files);
}
?>
```

Which will return something like:

```php
Array
(
    [/etc/php5] => /etc/php5
    [/etc/php5/cli] => /etc/php5/cli
    [/etc/php5/cli/conf.d] => /etc/php5/cli/conf.d
    [/etc/php5/cli/php.ini] => /etc/php5/cli/php.ini
    [/etc/php5/conf.d] => /etc/php5/conf.d
    [/etc/php5/conf.d/mysqli.ini] => /etc/php5/conf.d/mysqli.ini
    [/etc/php5/conf.d/curl.ini] => /etc/php5/conf.d/curl.ini
    [/etc/php5/conf.d/snmp.ini] => /etc/php5/conf.d/snmp.ini
    [/etc/php5/conf.d/gd.ini] => /etc/php5/conf.d/gd.ini
    [/etc/php5/apache2] => /etc/php5/apache2
    [/etc/php5/apache2/conf.d] => /etc/php5/apache2/conf.d
    [/etc/php5/apache2/php.ini] => /etc/php5/apache2/php.ini
)
```

Now if we want to transform this list into a tree structure with each
directory as a nested node, a child of another directory, all we would have to
do is run:

```php
<?php
// let '/' be our delimiter
$tree = explodeTree($key_files, "/");
// show the array
print_r($tree);
?>
```

And that single command would give the totally awesome:

```php
Array
(
    [etc] => Array
        (
            [php5] => Array
                (
                    [cli] => Array
                        (
                            [conf.d] => /etc/php5/cli/conf.d
                            [php.ini] => /etc/php5/cli/php.ini
                        )
                    [conf.d] => Array
                        (
                            [mysqli.ini] => /etc/php5/conf.d/mysqli.ini
                            [curl.ini] => /etc/php5/conf.d/curl.ini
                            [snmp.ini] => /etc/php5/conf.d/snmp.ini
                            [gd.ini] => /etc/php5/conf.d/gd.ini
                        )

                    [apache2] => Array
                        (
                            [conf.d] => /etc/php5/apache2/conf.d
                            [php.ini] => /etc/php5/apache2/php.ini
                        )
                )
        )
)
```

**Wow!** So this would make it very easy to visually lay out a tree structure
of the directory `/etc/php5`. But remember this is *just* an example. The
function now explodes on the '/' character, but you can use any delimiter to
explode a single-dimensional array into a Tree. So how does this `explodeTree`
function work?

## The Function: explodeTree()

Thanks to [Lachlan Donald](https://www.lachlandonald.com) and *Takkie*
for contributing to this function.

```php
<?php
/**
 * Explode any single-dimensional array into a full blown tree structure,
 * based on the delimiters found in its keys.
 *
 * The following code block can be utilized by PEAR's Testing_DocTest
 * <code>
 * // Input //
 * $key_files = array(
 *	 "/etc/php5" => "/etc/php5",
 *	 "/etc/php5/cli" => "/etc/php5/cli",
 *	 "/etc/php5/cli/conf.d" => "/etc/php5/cli/conf.d",
 *	 "/etc/php5/cli/php.ini" => "/etc/php5/cli/php.ini",
 *	 "/etc/php5/conf.d" => "/etc/php5/conf.d",
 *	 "/etc/php5/conf.d/mysqli.ini" => "/etc/php5/conf.d/mysqli.ini",
 *	 "/etc/php5/conf.d/curl.ini" => "/etc/php5/conf.d/curl.ini",
 *	 "/etc/php5/conf.d/snmp.ini" => "/etc/php5/conf.d/snmp.ini",
 *	 "/etc/php5/conf.d/gd.ini" => "/etc/php5/conf.d/gd.ini",
 *	 "/etc/php5/apache2" => "/etc/php5/apache2",
 *	 "/etc/php5/apache2/conf.d" => "/etc/php5/apache2/conf.d",
 *	 "/etc/php5/apache2/php.ini" => "/etc/php5/apache2/php.ini"
 * );
 *
 * // Execute //
 * $tree = explodeTree($key_files, "/", true);
 *
 * // Show //
 * print_r($tree);
 *
 * // expects:
 * // Array
 * // (
 * //	 [etc] => Array
 * //		 (
 * //			 [php5] => Array
 * //				 (
 * //					 [__base_val] => /etc/php5
 * //					 [cli] => Array
 * //						 (
 * //							 [__base_val] => /etc/php5/cli
 * //							 [conf.d] => /etc/php5/cli/conf.d
 * //							 [php.ini] => /etc/php5/cli/php.ini
 * //						 )
 * //
 * //					 [conf.d] => Array
 * //						 (
 * //							 [__base_val] => /etc/php5/conf.d
 * //							 [mysqli.ini] => /etc/php5/conf.d/mysqli.ini
 * //							 [curl.ini] => /etc/php5/conf.d/curl.ini
 * //							 [snmp.ini] => /etc/php5/conf.d/snmp.ini
 * //							 [gd.ini] => /etc/php5/conf.d/gd.ini
 * //						 )
 * //
 * //					 [apache2] => Array
 * //						 (
 * //							 [__base_val] => /etc/php5/apache2
 * //							 [conf.d] => /etc/php5/apache2/conf.d
 * //							 [php.ini] => /etc/php5/apache2/php.ini
 * //						 )
 * //
 * //				 )
 * //
 * //		 )
 * //
 * // )
 * </code>
 *
 * @author	Kevin van Zonneveld <kevin@vanzonneveld.net>
 * @author	Lachlan Donald
 * @author	Takkie
 * @copyright 2008 Kevin van Zonneveld (https://kevin.vanzonneveld.net)
 * @license   https://www.opensource.org/licenses/bsd-license.php New BSD Licence
 * @version   SVN: Release: $Id: explodeTree.inc.php 89 2008-09-05 20:52:48Z kevin $
 * @link	  https://kevin.vanzonneveld.net/
 *
 * @param array   $array
 * @param string  $delimiter
 * @param boolean $baseval
 *
 * @return array
 */
function explodeTree($array, $delimiter = '_', $baseval = false)
{
	if(!is_array($array)) return false;
	$splitRE   = '/' . preg_quote($delimiter, '/') . '/';
	$returnArr = array();
	foreach ($array as $key => $val) {
		// Get parent parts and the current leaf
		$parts	= preg_split($splitRE, $key, -1, PREG_SPLIT_NO_EMPTY);
		$leafPart = array_pop($parts);

		// Build parent structure
		// Might be slow for really deep and large structures
		$parentArr = &$returnArr;
		foreach ($parts as $part) {
			if (!isset($parentArr[$part])) {
				$parentArr[$part] = array();
			} elseif (!is_array($parentArr[$part])) {
				if ($baseval) {
					$parentArr[$part] = array('__base_val' => $parentArr[$part]);
				} else {
					$parentArr[$part] = array();
				}
			}
			$parentArr = &$parentArr[$part];
		}

		// Add the final part to the structure
		if (empty($parentArr[$leafPart])) {
			$parentArr[$leafPart] = $val;
		} elseif ($baseval && is_array($parentArr[$leafPart])) {
			$parentArr[$leafPart]['__base_val'] = $val;
		}
	}
	return $returnArr;
}
?>
```

The first two arguments of `explodeTree()` are clear I guess. But what about
that 3rd parameter: `$baseval`?

## The Baseval Argument

In the first example you see that only leaves (the bottom nodes that don't have
any children) maintain their original values (the filepaths in this case). If
you want higher nodes (parents) to also maintain their values, you'll have to
tell `explodeTree` to do so like this:

```php
<?php
// now the 3rd argument, the baseval, is true
$tree = explodeTree($key_files, "/", true);
?>
```

And then `explodeTree` will preserve the node's original value in the
`__base_val` items. Like this:

```php
Array
(
    [etc] => Array
        (
            [php5] => Array
                (
                    [__base_val] => /etc/php5
                    [cli] => Array
                        (
                            [__base_val] => /etc/php5/cli
                            [conf.d] => /etc/php5/cli/conf.d
                            [php.ini] => /etc/php5/cli/php.ini
                        )

                    [conf.d] => Array
                        (
                            [__base_val] => /etc/php5/conf.d
                            [mysqli.ini] => /etc/php5/conf.d/mysqli.ini
                            [curl.ini] => /etc/php5/conf.d/curl.ini
                            [snmp.ini] => /etc/php5/conf.d/snmp.ini
                            [gd.ini] => /etc/php5/conf.d/gd.ini
                        )
                    [apache2] => Array
                        (
                            [__base_val] => /etc/php5/apache2
                            [conf.d] => /etc/php5/apache2/conf.d
                            [php.ini] => /etc/php5/apache2/php.ini
                        )

                )

        )

)
```

See what happens? Baseval creates a placeholder: a semi-node for the original
value of its parent. The value `'/etc/php5'` is now saved. Without baseval
this value would be lost because there would be no place to store it.
That might come in handy!

## So you've got a tree. Now what?

Trees with unlimited levels of **nodes** require **recursive functions** that
can traverse the entire structure. Recursive functions are functions that call
themselves every time they find more items to process. Here's one to lay out
the directories:

```php
<?php
function plotTree($arr, $indent=0, $mother_run=true){
    if ($mother_run) {
        // the beginning of plotTree. We're at rootlevel
        echo "start\n";
    }

    foreach ($arr as $k=>$v){
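        // skip the baseval thingy. Not a real node.
        // (strict comparison, so a numeric key like 0 is not skipped by accident)
        if ($k === "__base_val") continue;
        // determine the real value of this node (empty if it has no baseval).
        $show_val = (is_array($v) ? (isset($v["__base_val"]) ? $v["__base_val"] : "") : $v);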
        // show the indents
        echo str_repeat("  ", $indent);
        if ($indent == 0) {
            // this is a root node. no parents
            echo "O ";
        } elseif (is_array($v)){
            // this is a normal node. parents and children
            echo "+ ";
        } else {
            // this is a leaf node. no children
            echo "- ";
        }

        // show the actual node
        echo $k . " (" . $show_val. ")" . "\n";
        if (is_array($v)) {
            // this is what makes it recursive, rerun for children
            plotTree($v, ($indent+1), false);
        }
    }

    if ($mother_run) {
        echo "end\n";
    }
}
?>
```

And this would output:

```bash
start
O etc ()
  + php5 (/etc/php5)
    + cli (/etc/php5/cli)
      - conf.d (/etc/php5/cli/conf.d)
      - php.ini (/etc/php5/cli/php.ini)
    + conf.d (/etc/php5/conf.d)
      - mysqli.ini (/etc/php5/conf.d/mysqli.ini)
      - curl.ini (/etc/php5/conf.d/curl.ini)
      - snmp.ini (/etc/php5/conf.d/snmp.ini)
      - gd.ini (/etc/php5/conf.d/gd.ini)
    + apache2 (/etc/php5/apache2)
      - conf.d (/etc/php5/apache2/conf.d)
      - php.ini (/etc/php5/apache2/php.ini)
end
```

If I overlooked a standard PHP function that can already do this, or you have
other improvements/ideas, leave a comment!

Thanks again: [Lachlan Donald](https://www.lachlandonald.com) & Takkie for insightful comments and great effort.
]]></content:encoded>
      <dc:date>2007-10-03T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Fit More on One Screen Using DPI</title>
      <link>https://kvz.io/fit-more-on-one-screen-using-dpi.html</link>
      <description><![CDATA[A couple of years ago when everyone still had giant CRT monitors, resolutions
of 1600x1200 were pretty common. Nowadays however 19" TFT monitors often
cannot scale higher than 1280x1024. So how can we still fit more on one
screen? DPI can help!
]]></description>
      <pubDate>Wed, 29 Aug 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/fit-more-on-one-screen-using-dpi.html</guid>
      <content:encoded><![CDATA[A couple of years ago when everyone still had giant CRT monitors, resolutions
of 1600x1200 were pretty common. Nowadays however 19" TFT monitors often
cannot scale higher than 1280x1024. So how can we still fit more on one
screen? DPI can help!

<!--more-->

## About This Article

Though you might not really need the biggest resolutions, sometimes it would
be nice to be able to fit more on your screen. There's a trick in Gnome to
achieve this using the font Dots Per Inch settings. I use Ubuntu but it should
work on every Linux distribution with Gnome installed.

In this article we're not really going to change the resolution. Instead we're
going to change the font rendering details, making every letter a little
smaller, thus resulting in more space everywhere.

## How to Change

Click on the: *System -> Preferences -> Font* menu

Click on the *Details* button. This will open up *Font Rendering Details*

Now you can change the DPI setting. The default is 96 which is pretty big in
my opinion. I finally chose 72, but feel free to experiment like this:

![dpi.png](/assets/images/posts/dpi.png "dpi.png")

I did an upgrade to Gutsy two months ago, but Feisty works exactly the
same, the only difference is that Gutsy has a: *System -> Preferences ->
Appearance* menu, with a *Font* tab.
]]></content:encoded>
      <dc:date>2007-08-29T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Login Automatically With SSH Keys</title>
      <link>https://kvz.io/login-automatically-with-ssh-keys.html</link>
      <description><![CDATA[With SSH you can securely login to any Linux server and execute commands
remotely. You can even use SSH to transfer and
synchronize files from one server to another. Automating these tasks can make your life easier, but
normally SSH gets in the way because it requires you to log in every time. Well,
not anymore: in this article I will show you how to connect over SSH without a
password.
]]></description>
      <pubDate>Sun, 19 Aug 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/login-automatically-with-ssh-keys.html</guid>
      <content:encoded><![CDATA[With SSH you can securely login to any Linux server and execute commands
remotely. You can even use SSH to transfer and
[synchronize files from one server to another](/blog/2007/08/16/synchronize-files-with-rsync/). Automating these tasks can make your life easier, but
normally SSH gets in the way because it requires you to log in every time. Well,
not anymore: in this article I will show you how to connect over SSH without a
password.

<!--more-->

## About SSH Keys

SSH keys allow machines to identify each other without you having to type the
password every time. First we need to generate a key pair (nothing more than
randomly generated sequences of bytes; see it as a fingerprint) on the machine
you're going to make the connection from. Then you install the public half of
that key on the machine that needs to accept the connection.
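
To make the "install" step a bit less magical, here is a minimal sketch of what has to happen on the accepting machine once the public key has been copied over. `install_pubkey` is a hypothetical helper I made up for illustration; it is not part of the script below, and the paths are only examples.

```bash
#!/usr/bin/env bash
# Sketch of the "install the key" step on the machine that accepts the connection.
# install_pubkey is a hypothetical helper, not part of instkey.sh.
install_pubkey() {
  pubkey_file="$1"             # public key copied over from the connecting machine
  ssh_dir="${2:-$HOME/.ssh}"   # where the accepting user keeps authorized_keys
  auth_file="$ssh_dir/authorized_keys"

  # With its default settings sshd refuses keys if these permissions are too open
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  touch "$auth_file" && chmod 600 "$auth_file"

  # Append the key only if this exact line is not there yet
  key="$(cat "$pubkey_file")"
  grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}
```

On the connecting machine, `ssh-keygen` creates the key pair, and `ssh-copy-id` does roughly the same as this helper over an SSH connection.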

## Little Helper Script

Installing keys takes quite a couple of commands, not very easy to remember
either. And if you have multiple servers, you might even want to automate the
process of installing keys. No worries, I did this for you. So just download
[the helper script](https://raw.github.com/kvz/deprecated/kvzlib/bash/programs/instkey.sh) and install it. Open a terminal, and type:

```bash
$ su -  # If you're going to use the keys to automate tasks, become root first
$ mkdir -p ~/bin
$ curl https://raw.github.com/kvz/deprecated/kvzlib/bash/programs/instkey.sh -ko ~/bin/instkey.bash \
 && chmod u+x $_
```

## Running the Script: Installing Keys

Now with the script in place, installing SSH keys is easy. To allow easy
access to `server.example.com` just open a terminal and type:

```bash
$ ~/bin/instkey.bash server.example.com
```

The first time you run the script, it will create the necessary keys, when it
asks for a pass phrase, just hit enter. Then it logs in at
`server.example.com` (now you need to enter the server's password for the last
time ; ), and it saves the key.

### Installing ssh Keys Under a Different User

Make sure you are logged in as the user you want to have passwordless ssh
access. Let's say this user is called: `kevin`.

Go to the directory you downloaded the script to (`~/bin` in the example above), and type:

```bash
$ ./instkey.bash server.example.com kevin
```

Notice the second argument? This will make sure keys from kevin aren't
remotely installed to root, but to kevin as well. Easy right?

**Congratulations!** You can now type

```bash
$ ssh server.example.com
```

And you'll be logged in right away! Another great idea is to use this
technology
to [automatically synchronize files with rsync](/blog/2007/08/16/synchronize-files-with-rsync/).

## Pitfalls

- Of course you should really be careful where and when to install ssh
  keys, because if one machine is compromised, it's very easy for a cracker to
  hop to the next system without logging in. So choose wisely when to use this
  technology.
- Keys are user specific. So if you're going to run programs as root
  that need to automatically login to systems, you must also install the key as
  root.
]]></content:encoded>
      <dc:date>2007-08-19T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Synchronize Files With rsync</title>
      <link>https://kvz.io/synchronize-files-with-rsync.html</link>
      <description><![CDATA[Synchronizing files from one server to another is quite awesome. You can
use it for backups, for keeping web servers in sync, and much more. It's fast and it doesn't take up as much bandwidth as normal copying would. And the
best thing is, it can be done with only 1 command. Welcome to the wonderful
world of rsync.
]]></description>
      <pubDate>Thu, 16 Aug 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/synchronize-files-with-rsync.html</guid>
      <content:encoded><![CDATA[[Synchronizing](/categories/synchronization/) files from one server to another is quite awesome. You can
use it for [backups](/categories/backup/), for keeping web servers in sync, and much more. It's fast and it doesn't take up as much bandwidth as normal copying would. And the
best thing is, it can be done with only 1 command. Welcome to the wonderful
world of [rsync](/categories/rsync/).

<!--more-->

## Installing rsync

On most modern [Linux](/categories/linux/) distributions you will find rsync comes
preinstalled. If that's not the case, just install it with your package
manager. On [Ubuntu](/categories/ubuntu/) this would look like:

```bash
$ aptitude -y install rsync
```

done!

## Simple - One Command

Let's copy our local `/home/kevin/source` to `/home/kevin/destination` which
resides on the server: `server.example.com`:

```bash
$ rsync -az --progress --size-only /home/kevin/source/* server.example.com:/home/kevin/destination/
```

**explained:**

- `-a` archive, preserves all attributes like recursive ownership, timestamps, etc
- `-z` compress, saves bandwidth but is harder on your CPU so use it for slow/expensive connections only
- `--progress` shows you the progress of all the files that are being synced
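- `--size-only` skips files whose size already matches at the destination, ignoring timestamps (by default rsync compares size *and* modification time). It's a bit faster, but a changed file that kept the same size will be missed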

Note that this sync skips hidden files in the top level of `source`, because the shell expands `*` before rsync
runs and `*` does not match dotfiles. If you want to include
hidden files, write the source like this: `/home/kevin/source/` and remove the trailing slash
from the destination like so: `/home/kevin/destination`.

Well, that's it! But read on if you want to learn how to automate this.

## Advanced - Automatic Syncing With SSH Keys

Alright, so syncing files on Linux is pretty easy. But what if we want to
automate this? How can we keep rsync from asking for a password every time?

There are different ways to go about this, but the one I mostly use is
installing SSH keys. By installing your SSH key on the destination
server, it will recognize you in the future and permit instant access. So this
way we can automate the synchronization with rsync.

### Easy Script

I've written another article on [setting up SSH keys](/blog/2007/08/19/login-automatically-with-ssh-keys/). It also
includes a script that can do all the work for you.

### Did It Work?

Open a terminal and type:

```bash
$ ssh server.example.com
```

It should not ask you for any password. **Great!** This means we can also run
rsync directly without logging in! If you need more in-depth information on
this, I wrote an article on [logging in automatically with SSH keys](/blog/2007/08/19/login-automatically-with-ssh-keys/).

## Let's create a sync script

So now just create a script `/root/bin/syncdata.bash`

```bash
$ $EDITOR /root/bin/syncdata.bash
```

that contains your rsync command:

```bash
#!/usr/bin/env bash
rsync -az --delete /home/kevin/source/ server.example.com:/home/kevin/destination
```

Save the file and exit and make it executable like this:

```bash
$ chmod +x /root/bin/syncdata.bash
```

## Schedule It to Run Every Hour

And to have your data synchronized every hour, open up your [crontab](/categories/crontab/)
editor:

```bash
$ crontab -e
```

And type

```bash
0 * * * * /root/bin/syncdata.bash
```

(if you need more in depth information on crontab I've written another article
on [scheduling tasks on linux using crontab](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/))

That's it! New files are automatically updated at
`server.example.com:/home/kevin/destination/` every hour. Files that are
deleted from `/home/kevin/source/*` are also deleted at the destination, thanks
to the `--delete` parameter.

## Some Extra rsync Command Line Options

Some extra arguments that might come in handy customizing your synchronization
job:

- `--delete` delete files remotely that no longer exist locally
- `--dry-run` show what would have been transferred, but do not transfer anything
- `--max-delete=10` don't delete more than 10 files in one run, safety precaution
- `--delay-updates` put all updated files into place at transfer's end, very useful for live systems
- `--compress-level=9` explicitly set compression level *9*. 0 disables compression
- `--exclude-from=/root/sync_exclude` specifies a file */root/sync\_exclude* that contains exclude patterns (one per line). Filenames matching these patterns will not be transferred
- `--bwlimit=1024` limits the transfer rate to a maximum of 1024 kilobytes per second
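
These options combine nicely into a safety net for the `--delete` sync script from earlier. Here's a minimal sketch, assuming the same paths as above. The `sync_data` function and the `RSYNC` variable are names I made up for illustration, and `--itemize-changes` is a standard rsync option that lists what would change:

```bash
#!/usr/bin/env bash
# Sketch: preview a --delete sync before running it for real.
# RSYNC is a hypothetical override, handy for testing (e.g. RSYNC=echo).
RSYNC="${RSYNC:-rsync}"
SRC="/home/kevin/source/"
DEST="server.example.com:/home/kevin/destination"

sync_data() {
    extra=""
    if [ "${1:-}" = "preview" ]; then
        # --dry-run transfers and deletes nothing, it only reports
        extra="--dry-run --itemize-changes"
    fi
    # $extra is unquoted on purpose so it splits into separate options
    "$RSYNC" -az --delete --max-delete=10 $extra "$SRC" "$DEST"
}
```

Run `sync_data preview` first, check the list, then run `sync_data` without arguments to do the real thing.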

## Pitfalls

- Of course you should be really careful about where and when to install SSH
  keys, because if one machine is compromised, it's very easy for a cracker to
  hop to the next system without logging in. So choose wisely when to use this
  technology. You might consider 'pulling' the files in from the backup machine if that one is less exposed. This way if your main machine gets hacked, they can't hop to your backup machine.
- Keys are user specific. So if you're going to run programs as root
  that need to automatically login to systems, you must also install the key as
  root.
]]></content:encoded>
      <dc:date>2007-08-16T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Delete Files Securely With Shred</title>
      <link>https://kvz.io/delete-files-securely-with-shred.html</link>
      <description><![CDATA[Deleting a file or reformatting a disk does not destroy your sensitive data.
The data can easily be undeleted. That's a good thing if you accidentally
throw something away, but what if you're trying to destroy financial data, bank
account passwords, or classified company information? What if you want to
clean your computer before selling it, for instance?
]]></description>
      <pubDate>Sat, 04 Aug 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/delete-files-securely-with-shred.html</guid>
      <content:encoded><![CDATA[Deleting a file or reformatting a disk does not destroy your sensitive data.
The data can easily be undeleted. That's a good thing if you accidentally
throw something away, but what if you're trying to destroy financial data, bank
account passwords, or classified company information? What if you want to
clean your computer before selling it, for instance?

<!--more-->

*In this article you will learn how to use a very powerful tool, so be
careful, because you could totally mess up your system. I currently use
Ubuntu, but this article should work for pretty much any distribution.*

## About Shred

To make sure the data is unrecoverable by anyone, it needs to be overwritten.
Ubuntu has got a standard tool for this called *shred*; you will probably find
it preinstalled on your distribution as well.

The shred command lets you delete files or entire hard drives permanently by
overwriting the data with random gibberish several times (3 passes by default
in current GNU coreutils; older versions defaulted to 25). This
destroys the original data and makes it almost impossible to recover.

## Using Shred

### Shred Files

For shredding files you can run shred like this:

```bash
$ shred -z -u -n200 /home/kevin/company_info/*
```

- `-z` overwrite with zeros on the final pass, to hide the shredding
- `-u` means delete when you're done overwriting
- `-n200` means overwrite *200* times
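
If you'd like to try this safely first, here's a small sketch that shreds a throwaway file. The temporary file is just a stand-in for your real data, and I use 3 passes to keep it quick:

```bash
#!/usr/bin/env bash
# Create a throwaway file, then overwrite and delete it with shred.
demo_file="$(mktemp)"
echo "not so secret" > "$demo_file"

# -n 3: three overwrite passes, -z: final pass of zeros, -u: remove afterwards
shred -z -u -n 3 "$demo_file"

if [ ! -e "$demo_file" ]; then
    echo "file is gone"
fi
```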

### Shred Drives

Some things that I'm going to change for this operation:

- Overwriting 200 times might take too long when overwriting an entire
  drive, so let's overwrite it 10 times.
- The device itself can't be deleted so the `-u` argument has to go.
- We need to replace the `/home/kevin/company_info/` with your device name,
  which you could look up by typing `df`.
- You can always combine short arguments so I'm going to do that as well.

So now the command could look something like this:

```bash
$ shred -zn10 /dev/hda
```

This will totally erase everything on your hard drive. It's best to do this
from a Live CD. Otherwise the running system will start missing
essential files during the first pass (they are being overwritten, after all), and you don't
want it to crash before all data has been destroyed.

## Final Remarks

Shred works best on an entire disk, because journaling filesystems may keep
copies of your data in other places on the disk. Shredding a single file does
not guarantee that those copies are overwritten as well.
]]></content:encoded>
      <dc:date>2007-08-04T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Restore Packages Using dselect-upgrade</title>
      <link>https://kvz.io/restore-packages-using-dselectupgrade.html</link>
      <description><![CDATA[It's always a good idea to back up important data. Your files and settings can
easily be archived. But how can you back up &amp; restore all applications
that you've installed over the last couple of years? Here's an easy trick that
works for both desktops &amp; servers, and that can also be used to synchronize
installed packages in a web cluster, making all the servers run the same
software.
]]></description>
      <pubDate>Fri, 03 Aug 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/restore-packages-using-dselectupgrade.html</guid>
      <content:encoded><![CDATA[It's always a good idea to back up important data. Your files and settings can
easily be archived. But how can you [back up](/categories/backup/) & restore all applications
that you've installed over the last couple of years? Here's an easy trick that
works for both desktops & servers, and that can also be used to synchronize
installed packages in a web cluster, making all the servers run the same
software.

<!--more-->

*The method described in this article depends on the command apt-get, so it
works on Debian & [Ubuntu](/categories/ubuntu/) systems. This article does not describe a full
backup & restore method; it's a trick to add to your existing backup
procedure. Still, it's a trick that will really make your life easier.*

## APT Packages

The basic idea is that we generate a list of all currently installed packages,
keep it some place safe, and upon a reinstall, we can upload this list again
and have the system install all the packages in this list automatically.

### How to Backup

So first we need to create a list of all the installed APT packages and save
it in a file:

```bash
$ sudo dpkg --get-selections > /tmp/dpkglist.txt
```

That's it! The list is now stored in `/tmp/dpkglist.txt`. If you want you can
[add this command to your crontab](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/) and then just include the file
`/tmp/dpkglist.txt` in your backup procedure so that it's safe and up to date
at all times.

### How to Restore

Now if your system crashes (let's all hope it won't) and you need to
reinstall, this will be the procedure:

- install a fresh OS (of course)
- restore the package list
- restore your important files & settings

But how can we restore the package list? Simple. Just copy your backed up
`dpkglist.txt` file to your fresh system's `/tmp` directory again and execute
the following:

```bash
$ sudo dpkg --set-selections < /tmp/dpkglist.txt
$ sudo apt-get -y update
$ sudo apt-get dselect-upgrade
```

**Great!** All of your apt packages have been restored!

**Don't worry!** This method only adds and upgrades packages; it *will not remove* packages that do not exist in the list.
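
Before restoring, it can be reassuring to preview which packages would be added. Here's a rough sketch, assuming both files are in the `dpkg --get-selections` format (package name, whitespace, state). `missing_packages` and `/tmp/current.txt` are names I made up for this example:

```bash
#!/usr/bin/env bash
# Print packages marked "install" in the saved list ($1)
# that are not marked "install" in the current list ($2).
missing_packages() {
    # awk reads the current list first (ARGV[1]), then the saved list
    awk '
        FILENAME == ARGV[1] { if ($2 == "install") have[$1] = 1; next }
        $2 == "install" && !($1 in have) { print $1 }
    ' "$2" "$1"
}
```

On the fresh system you could run `dpkg --get-selections > /tmp/current.txt` and then `missing_packages /tmp/dpkglist.txt /tmp/current.txt` to see what the restore is about to install.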

## Additional Trick: PEAR Packages (Web Servers Only)

The same method can be used to restore [PEAR](/categories/pear/) extensions. Though there
aren't any standard tools that I know of, with a little creativity it's not so
hard.

### How to Backup

This will generate a list of all installed PEAR packages and save it to a
file:

```bash
$ sudo pear -q list | egrep 'alpha|beta|stable' |awk '{print $1}' > /tmp/pearlist.txt
```

That's it! A list of your installed PEAR packages is stored in the file
`/tmp/pearlist.txt`. Now, if you want you can [add this command to your
crontab](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/) and then just include the file `/tmp/pearlist.txt` in your backup
procedure so that it's safe and up to date at all times.

### How to Restore

To restore, make sure PEAR is installed, then simply copy the `pearlist.txt` file
back to your new system's `/tmp` directory and type:

```bash
$ cat /tmp/pearlist.txt |awk '{print "pear install -f "$0}' |sudo bash
```

**Great!** All of your PEAR packages have been restored!
]]></content:encoded>
      <dc:date>2007-08-03T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Survive Heavy Traffic With Your Webserver</title>
      <link>https://kvz.io/survive-heavy-traffic-with-your-webserver.html</link>
      <description><![CDATA[Recently two of my articles reached the Digg frontpage on the same day. My web
server isn't state of the art and it had to handle gigantic amounts of
traffic. But still it served pages to visitors swiftly thanks to a lot of
optimizations. This is how you can prevent heavy traffic from killing your
server.
]]></description>
      <pubDate>Wed, 01 Aug 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/survive-heavy-traffic-with-your-webserver.html</guid>
      <content:encoded><![CDATA[Recently two of my articles reached the Digg frontpage on the same day. My web
server isn't state of the art and it had to handle gigantic amounts of
traffic. But still it served pages to visitors swiftly thanks to a lot of
optimizations. This is how you can prevent heavy traffic from killing your
server.

<!--more-->

## About This Article

There are many things you can do to speed up your website. This article
focuses on practical things that I used, without spending any money on
additional hardware or commercial software.

In this article I assume that you're already familiar with system
administration and hosting / creating websites. In examples I use Ubuntu, but
if you use another distro, just make some minor adjustments (like package
management) and it should work as well.

Beware, if you don't know what you're doing you could seriously mess up your
system.

## Cache PHP Output

Every time a request hits your server, [PHP](/categories/php/) has to do a lot of processing:
all of your code has to be compiled & executed for every single visit, even
though the outcome of all this processing is identical for both visitor 21600
and 21601. So why not save the flat HTML generated for visitor 21600, and
serve that to 21601 as well? This will relieve resources of your web server
*and* database server because less PHP often means fewer database queries.

Now you could write such a system yourself but there's a neat package in
[PEAR](/categories/pear/) called [Cache\_Lite](https://pear.php.net/manual/en/package.caching.cache-lite.intro.php) that can do this for us. Its benefits:

- it saves us the time of reinventing the wheel
- it's been thoroughly tested
- it's easy to implement
- it's got some cool features like lifetime, read/write control, etc.

Installing is like taking candy from a baby. On Ubuntu I would:

```bash
$ sudo aptitude install php-pear
$ sudo pear install Cache_Lite
```

**And we're ready to use one of our most important assets!**

To learn exactly how to implement Cache\_Lite into your code I've written
another article called: [Speedup your website with Cache\_Lite](/blog/2007/08/01/speedup-your-website-with-cache-lite/).

## Create Turbo Charged Storage

With the PHP [caching](/categories/caching/) mechanism in place, we take away a lot of stress
from your CPU & RAM, but not from your disk. This can be solved by creating a
storage device with your system's RAM, like this:

```bash
$ mkdir -p /var/www/www.mysite.com/ramdrive
$ mount -t tmpfs -o size=500M,mode=0744 tmpfs /var/www/www.mysite.com/ramdrive
```

Now the directory `/var/www/www.mysite.com/ramdrive` is not located on your
disk, but in your system's memory. And that's about 30 times faster :) So why
not store your PHP cache files in this directory? You could even copy all
static files (images, css, js) to this device to minimize [disk IO](/categories/io/). Two
things to remember:

- All files in your ramdrive are lost on reboot, so create a script to
  restore files from disk to RAM
- The ramdrive itself is lost on reboot, but you can add an entry to
  `/etc/fstab` to prevent that

To learn exactly how to tackle the above, I've written another article called:
[Create turbocharged storage using tmpfs](/blog/2007/07/18/create-turbocharged-storage-using-tmpfs/).
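
To give an idea of the first point, here's a minimal sketch of such a restore script. It assumes you keep a master copy of the files on disk. `restore_ramdrive` and the `ramdrive-master` directory in the usage line are made-up names:

```bash
#!/usr/bin/env bash
# Copy everything (hidden files included) from a master copy on disk
# into the freshly mounted ramdrive. Both paths are passed as arguments.
restore_ramdrive() {
    src="$1"
    dst="$2"
    mkdir -p "$dst"
    # "$src"/. copies the contents of src rather than the directory itself
    cp -a "$src"/. "$dst"/
}
```

After a reboot you would call it like `restore_ramdrive /var/www/www.mysite.com/ramdrive-master /var/www/www.mysite.com/ramdrive`, once the tmpfs has been mounted.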

## Leave Heavy Processing to Cronjobs

For example, I count the number of visits for every single article. But instead
of updating a counter for an article on every visit (which involves row locking
and a WHERE clause), I use simple and relatively cheap SQL
INSERTs into a separate table.

The gathered data is processed every 5 minutes by a separate PHP script that's
automatically run by my server. It counts the hits per article, then deletes
the gathered data and updates the grand totals in a separate field in my
article table. So finally accessing the hit count of an article takes no extra
processing time or heavy queries.
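
The same append-now, count-later pattern can be sketched without a database. This is only an analogy to my SQL setup, not the actual code: a flat file stands in for the separate table, and `HITS_LOG`, `record_hit` and `aggregate_hits` are hypothetical names:

```bash
#!/usr/bin/env bash
# Hypothetical flat-file stand-in for the "separate table" of raw hits.
HITS_LOG="${HITS_LOG:-/tmp/hits.log}"

# Cheap write on every visit: just append the article id.
record_hit() {
    echo "$1" >> "$HITS_LOG"
}

# Run from cron: move the raw data aside, count hits per article,
# print "article count" lines, then delete the processed data.
aggregate_hits() {
    [ -f "$HITS_LOG" ] || return 0
    mv "$HITS_LOG" "$HITS_LOG.work"
    sort "$HITS_LOG.work" | uniq -c | awk '{ print $2, $1 }'
    rm -f "$HITS_LOG.work"
}
```

Moving the file aside before counting means hits that arrive during the count land in a fresh file instead of getting lost.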

If you want more in depth information on writing cronjobs, I've written
another article called: [Schedule tasks on Linux using crontab](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/).

## Optimize Your Database

### Use the InnoDB Storage Engine

If you use a [MySQL](/categories/mysql/) version older than 5.5, the default storage engine for tables is MyISAM (newer versions default to InnoDB). MyISAM is
not ideal for a high traffic website because it uses table-level locking,
which means during an UPDATE, nobody can access any other record of the same
table. It puts everyone on hold!

InnoDB, however, uses row-level locking. Row-level locking ensures that during
an UPDATE, nobody else can modify that particular row until the locking
transaction issues a COMMIT.

phpmyadmin allows you to easily change the table type in the *Operations* tab.
Though it never caused me any problems, it's wise to first create a backup of
the table you're going to ALTER.

### Use Optimal Field Types

Wherever you can, make integer fields as small as possible. Not by changing
the length, but by changing the field's actual integer type. The length is only used for
padding.

So if you don't need negative numbers in a column, always make a field
*unsigned*. That way you can store maximum values with minimum space (bytes).
Also make sure foreign keys have matching field types, and place indexes on
them. This will greatly speedup queries.

In phpmyadmin there's a link *Propose Table Structure*. Take a look sometime,
it will try to tell you what fields can be optimized for your specific db
layout.

### Queries

Never select more fields than strictly necessary. Sometimes when you're lazy
you might do a:

```sql
SELECT * FROM `blog_posts`
```

even though a

```sql
SELECT `blog_post_id`,`title` FROM `blog_posts`
```

would suffice. Normally that's OK, but not when performance is your no.1
priority.

### Tweak the MySQL Config

Furthermore there are quite some things you can do to the `my.cnf` file, but
I'll save that for another article as it's a bit out of this article's scope.

## Save Some Bandwidth

### Save Some Sockets First

Small optimizations make for big bandwidth savings when volumes are high. If
traffic is a big issue, or you really need that extra server capacity, you
could throw all CSS code into one big `.css` file. Do this with the JS code as
well. This will save you some [Apache](/categories/apache/) sockets that other visitors can use
for their requests. It will also give you better compression ratios, should
you choose to use mod\_deflate or compress your javascript with [Dean Edwards
Packer](https://dean.edwards.name/packer/).

I know what you're thinking. No, don't throw all the CSS and JS in the main
page. You still really want this separation to:

- make use of the visitor's browser cache. Once they've got your CSS, it
  won't be downloaded again
- not pollute your HTML with that stuff

### And Now Some Bandwidth ; )

- Limit the number of images on your site
- Compress your images
- Eliminate unnecessary whitespace or even compress JS with tools available
  everywhere.
- [Apache can compress the output before it's sent back to the client
  through mod\_deflate](/blog/2008/03/29/better-performance-with-mod-deflate/). This results in a smaller page being sent over the
  Internet at the expense of CPU cycles on the Web server. For those servers
  that can afford the CPU overhead, this is an excellent way of saving
  bandwidth. If CPU is your bottleneck, though, I would turn all compression off to save some extra CPU cycles.

## Store PHP Sessions in Your Database

If you use PHP sessions to keep track of your logged in users, then you may
want to have a look at PHP's function: [session\_set\_save\_handler](https://nl3.php.net/manual/en/function.session-set-save-handler.php#60316). With
this function you can overrule PHP's session handling system with your own
class, and store sessions in a database table or in Memcached.

Now a key to success is to make this table's storage engine MEMORY
(also known as HEAP). This stores all session information (should be tiny
variables) in the database server's RAM. That takes disk IO stress away from your
web server, and allows you to share the sessions between multiple web servers in
the future, so that if you're logged in on server A, you're also logged in on
server B, making it possible to load balance.

### Sessions on tmpfs

If it's too much of a hassle to store sessions in a MEMORY database, storing
session files on a ramdisk is also a good option to gain some performance.
Just make the `/var/lib/php5` live in RAM. To learn exactly how to do this,
I've written another article called: [Create turbocharged storage using tmpfs](/blog/2007/07/18/create-turbocharged-storage-using-tmpfs/).

### Sessions in Memcached

I recently (22nd June 2008) found another (better) way to store sessions in a
cluster-proof, resource-cheap way and dedicated a separate article to it
called: [Enhance PHP session management](/blog/2008/06/22/enhance-php-session-management/).

## More Tips

Some other things to google on if you want even more:

- eAccelerator
- memcached
- tweak the apache config
- squid
- turn off apache logging
- Add 'noatime' in /etc/fstab on your web and data drives to prevent disk
  writes on every read
]]></content:encoded>
      <dc:date>2007-08-01T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Speedup Your Website With Cache_Lite</title>
      <link>https://kvz.io/speedup-your-website-with-cache-lite.html</link>
      <description><![CDATA[Every time a request hits your server, PHP has to do a lot of processing:
all of your code has to be compiled &amp; executed for every single visit, even
though the outcome of all this processing is often identical for both visitor
21600 and 21601. So why not save the flat HTML generated for visitor 21600,
and serve that to 21601 as well? This will relieve resources of your web
server and database server because less PHP often means fewer queries.
]]></description>
      <pubDate>Wed, 01 Aug 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/speedup-your-website-with-cache-lite.html</guid>
      <content:encoded><![CDATA[Every time a request hits your server, [PHP](/categories/php/) has to do a lot of processing:
all of your code has to be compiled & executed for every single visit, even
though the outcome of all this processing is often identical for both visitor
21600 and 21601. So why not save the flat HTML generated for visitor 21600,
and serve that to 21601 as well? This will relieve resources of your web
server and database server because less PHP often means fewer queries.

<!--more-->

## Cache\_Lite

Now you could write such a [performance](/categories/performance/) system yourself but there's a neat
package in [PEAR](/categories/pear/) called Cache\_Lite that can do this for us. Its benefits:

- it saves us the time of reinventing the wheel
- it's been thoroughly tested
- it's easy to implement
- it's got some cool features like lifetime, read/write control, etc.

Now you don't need to [cache](/categories/caching/) entire pages, you can also just cache parts
of pages if that's easier to implement.

## Install Cache\_Lite

Installing is like taking candy from a baby. On Ubuntu I would use aptitude
like this:

```bash
$ sudo aptitude -y update
$ sudo aptitude install php-pear
```

And now that we have PEAR, I would use it to install the Cache\_Lite extension:

```bash
$ sudo pear install Cache_Lite
```

**And we're ready!**

## Implement Cache\_Lite

So let's see how we can implement Cache\_Lite into our scripts. The basic idea
is that we have a unique identifier for every page. You can make it up, get it
from your database, or use the `REQUEST_URI`. Cache\_Lite will see if it stored
content for that identifier before, and if it's fresh enough. If so, it will
retrieve the stored HTML from disk and echo it right away. If not, we:

- turn on output buffering so we can catch all following content
- include the original PHP code
- catch the output buffer, and let Cache\_Lite store it on disk for the next
  time
- and then echo it

This is a PHP example:

```php
<?php
/* Include the class */
require_once 'Cache/Lite.php';

/* Set a key for this cache item */
$id = 'newsitem1';
/* Set a few options */
$options = array(
    'cacheDir' => '/var/www/www.mywebsite.com/cache/',
    'lifeTime' => 3600
);

/* Create a Cache_Lite object */
$Cache_Lite = new Cache_Lite($options);
/* Test if there is a valid cache-entry for this key */
if ($data = $Cache_Lite->get($id)) {
    /* Cache hit! We've got the cached content stored in $data! */
} else {
    /* Cache miss! Use ob_start to catch all the output that comes next */
    ob_start();

    /* The original content, which is now captured in the output buffer */
    include "realcontent.php";
    /* Grab the fresh content into $data and close the output buffer */
    $data = ob_get_clean();

    /* Let's store our fresh content, so next
     * time we won't have to generate it! */
    $Cache_Lite->save($data, $id);
}
echo $data;
?>
```

In this example, the real, original PHP code is stored in `realcontent.php`.
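
The hit-or-miss flow itself isn't tied to PHP. As a rough illustration, here's the same idea sketched in shell, with a file's modification time playing the role of `lifeTime`. `generate_page` and `cached_page` are made-up names, and `find -mmin` is assumed to be available (it is in GNU and BSD find):

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the expensive page generation.
generate_page() {
    echo "<p>generated at $(date)</p>"
}

# Print the cached copy ($1) if it is younger than $2 minutes,
# otherwise regenerate the page, store it, and print it.
cached_page() (
    cache_file="$1"
    max_age_minutes="$2"
    # find prints the path only if the file exists and is fresh enough
    if [ -n "$(find "$cache_file" -mmin "-$max_age_minutes" 2>/dev/null)" ]; then
        cat "$cache_file"                    # cache hit
    else
        generate_page | tee "$cache_file"    # cache miss: store and print
    fi
)
```

Calling `cached_page /tmp/page.html 60` twice in a row only generates the page once, which matches the 3600 second lifetime in the PHP example.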

## Let Cache Live in RAM

If you want, you can have all the static html files served from the server's
internal memory. Now this would really speed things up. Check out my other
article [Create turbocharged storage using tmpfs](/blog/2007/07/18/create-turbocharged-storage-using-tmpfs/) to learn how.

## More Cache\_Lite

One thing I always like to do, is to automatically purge an article's cache
when a comment has been placed. You could for example place this before
Cache\_Lite checks if it's got a cached page for a specific `$id`:

```php
<?php
if (isset($_POST["add_comment"]) && $_POST["add_comment"]){
    $Cache_Lite->remove($id);
}
?>
```

Take some time to read the comments in the source code, it's actually pretty
easy.
]]></content:encoded>
      <dc:date>2007-08-01T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Make ISO Images on Linux</title>
      <link>https://kvz.io/make-iso-images-on-linux.html</link>
      <description><![CDATA[CDs and DVDs don't last forever, so you might want to back them up as
ISO images. All the files and properties of the original disc, stored in
a single file. You can also create ISO images and store them on your network
for easy distribution of software installations. Here's how to create and
mount ISO images on Linux.
]]></description>
      <pubDate>Wed, 01 Aug 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/make-iso-images-on-linux.html</guid>
      <content:encoded><![CDATA[CDs and DVDs don't last forever, so you might want to back them up as
[ISO images](/categories/iso/). All the files and properties of the original disc, stored in
a single file. You can also create ISO images and store them on your network
for easy distribution of software installations. Here's how to create and
mount ISO images on [Linux](/categories/linux/).

<!--more-->

## Graphical Utilities

Of course you can always install and use graphical disc authoring software
like GnomeBaker or K3b, but that's outside the scope of this article. I
just want to show you how you can quickly create an ISO image without
installing additional software.

## Command Line

We're going to use the command line tool [dd](/categories/dd/) for this. Insert the
disc that you want to copy and open a terminal.

### Create a CD-ROM Image

Now in the terminal type:

```bash
$ sudo dd if=/dev/cdrom of=cd.iso
```

A little explanation:

- `sudo` makes sure the command is executed as root. That's needed only if
  the user you're working under doesn't have enough permissions to access the
  device. If your user does have access, you can leave it out.
- `dd` stands for Disk Dump
- `if` stands for Input File
- `of` stands for Output File

Wait for the command to finish, and your new iso will be saved to `cd.iso`.
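
If you want to rehearse this without a disc in the drive, here's a small sketch that uses a regular file as a stand-in for `/dev/cdrom` and then checks that the copy is identical. The temporary files and the `bs=2048` block size (the sector size of a data CD) are my own choices for the example:

```bash
#!/usr/bin/env bash
# Rehearse the dd copy with a regular file standing in for /dev/cdrom.
src="$(mktemp)"
iso="$(mktemp)"
head -c 1048576 /dev/urandom > "$src"      # 1 MiB of stand-in "disc" data

# bs=2048 reads and writes in blocks of 2048 bytes
dd if="$src" of="$iso" bs=2048 2>/dev/null

# cmp exits with status 0 only if both files are byte-for-byte identical
if cmp -s "$src" "$iso"; then
    echo "image matches source"
fi
```

With a real disc you would put `/dev/cdrom` in place of `$src` and compare against the device in the same way.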

### Create a DVD Image

For a DVD image, your device is probably called `/dev/dvd` instead of
`/dev/cdrom` so the command would look like this:

```bash
$ sudo dd if=/dev/dvd of=dvd.iso
```

### Create a SCSI CD-ROM Image

For a SCSI CDROM image, your device is probably called `/dev/scd0` instead of
`/dev/cdrom` so the command would look like this:

```bash
$ sudo dd if=/dev/scd0 of=cd.iso
```

## Mounting an Image

Once you've created an ISO image you can mount it as a loopback device, as if
it were a normal disc. This will give you access to the files in the ISO
without you having to burn it to a disc first. For example if you wanted to
mount `cd.iso` to `/mnt/isoimage` you would run the following commands:

```bash
$ sudo mkdir -p /mnt/isoimage
$ sudo mount -o loop -t iso9660 cd.iso /mnt/isoimage
```

### Unmounting

To unmount a currently mounted volume, type:

```bash
$ sudo umount -lf /mnt/isoimage
```

`/mnt/isoimage` is the location of your mounted volume.
]]></content:encoded>
      <dc:date>2007-08-01T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Schedule Tasks on Linux Using Crontab</title>
      <link>https://kvz.io/schedule-tasks-on-linux-using-crontab.html</link>
      <description><![CDATA[If you've got a website that's heavy on your web server, you might want to run
some processes like generating thumbnails or enriching data in the background.
This way it cannot interfere with the user interface. Linux has a great
program for this called cron. It allows tasks to be automatically run in the
background at regular intervals. You could also use it to automatically create
backups, synchronize files, schedule updates, and much more. Welcome to
the wonderful world of crontab.
]]></description>
      <pubDate>Sun, 29 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/schedule-tasks-on-linux-using-crontab.html</guid>
      <content:encoded><![CDATA[If you've got a website that's heavy on your web server, you might want to run
some processes like generating thumbnails or enriching data in the background.
This way it cannot interfere with the user interface. [Linux](/categories/linux/) has a great
program for this called cron. It allows tasks to be automatically run in the
background at regular intervals. You could also use it to automatically create
backups, synchronize files, [schedule updates](/blog/2007/07/29/schedule-automatic-updates-on-ubuntu/), and much more. Welcome to
the wonderful world of [crontab](/categories/crontab/).

<!--more-->

## Crontab

The crontab (cron derives from *chronos*, Greek for time; tab stands for
*table*) command, found in Unix and Unix-like operating systems, is used to
schedule commands to be executed periodically. To see which cronjobs are
currently scheduled for root on your system, you can open a terminal and run:

```bash
$ sudo crontab -l
```

To edit the list of *cronjobs* you can run:

```bash
$ sudo crontab -e
```

This will open the default editor (could be vi or pico, if you want you can
[change the default editor](/blog/2007/07/11/change-the-default-editor/)) to let us manipulate the crontab. If you save
and exit the editor, all your cronjobs are saved into crontab. Cronjobs are
written in the following format:

```bash
* * * * * /bin/execute/this/script.sh
```

## Scheduling explained

As you can see there are 5 stars. The stars represent different date parts in
the following order:

- minute (from 0 to 59)
- hour (from 0 to 23)
- day of month (from 1 to 31)
- month (from 1 to 12)
- day of week (from 0 to 6) (0=Sunday)

### Execute every minute

If you leave the star, or asterisk, it means **every**. Maybe that's a bit
unclear. Let's use the previous example again:

```bash
* * * * * /bin/execute/this/script.sh
```

They are all still asterisks! So this means execute
`/bin/execute/this/script.sh`:

- **every** minute
- of **every** hour
- of **every** day of the month
- of **every** month
- and **every** day in the week.

In short: This script is being executed every minute. Without exception.

### Execute every Friday 1AM

So if we want to schedule the script to run at 1AM every Friday, we would need
the following cronjob:

```bash
0 1 * * 5 /bin/execute/this/script.sh
```

Get it? The script is now being executed when the system clock hits:

- minute: `0`
- of hour: `1`
- of day of month: `*` (every day of month)
- of month: `*` (every month)
- and weekday: `5` (=Friday)
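
To double-check how a line will be read, it can help to split it into the five time fields plus the command. The sketch below does that in plain shell for the Friday example. The variable names are only illustrative, and it splits on whitespace without validating the values:

```bash
# Split a cronjob line into its five time fields and the command
entry='0 1 * * 5 /bin/execute/this/script.sh'

# read assigns one word per variable; the last variable receives the rest of the line
read -r minute hour day_of_month month day_of_week command <<EOF
$entry
EOF

printf 'minute:       %s\n' "$minute"
printf 'hour:         %s\n' "$hour"
printf 'day of month: %s\n' "$day_of_month"
printf 'month:        %s\n' "$month"
printf 'day of week:  %s\n' "$day_of_week"
printf 'command:      %s\n' "$command"
```

Swap in any other line from this article to see which value lands in which field.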

### Execute on workdays 1AM

So if we want to schedule the script to run Monday till Friday at 1AM, we would
need the following cronjob:

```bash
0 1 * * 1-5 /bin/execute/this/script.sh
```

Get it? The script is now being executed when the system clock hits:

- minute: `0`
- of hour: `1`
- of day of month: `*` (every day of month)
- of month: `*` (every month)
- and weekday: `1-5` (=Monday till Friday)

### Execute at 10 past every hour on the 1st of every month

Here's another one, just for practice:

```bash
10 * 1 * * /bin/execute/this/script.sh
```

Fair enough, it takes some getting used to, but it offers great flexibility.

## Neat scheduling tricks

What if you'd want to run something every 10 minutes? Well you could do this:

```bash
0,10,20,30,40,50 * * * * /bin/execute/this/script.sh
```

But crontab allows you to do this as well:

```bash
*/10 * * * * /bin/execute/this/script.sh
```

Which will do exactly the same. Can you do the math? ; )
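
If you'd rather let the shell do the math, here's a small sketch that expands a step value over the minute range 0 to 59. The variable names are made up for the example:

```bash
# Expand a step value (the 10 in */10) into the explicit list of minutes
step=10
minutes=""
m=0
while [ "$m" -lt 60 ]; do
  # add a comma before every value except the first
  minutes="${minutes:+$minutes,}$m"
  m=$((m + step))
done
echo "$minutes"   # prints 0,10,20,30,40,50
```

Change `step` to 15 and you get the list for a job that runs every quarter of an hour.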

## Special words

Instead of the five time fields, you can also put in a single keyword:

```bash
@reboot     Run once, at startup
@yearly     Run once  a year     "0 0 1 1 *"
@annually   (same as  @yearly)
@monthly    Run once  a month    "0 0 1 * *"
@weekly     Run once  a week     "0 0 * * 0"
@daily      Run once  a day      "0 0 * * *"
@midnight   (same as  @daily)
@hourly     Run once  an hour    "0 * * * *"
```

Because the keyword replaces all five time fields, this would be valid:

```bash
@daily /bin/execute/this/script.sh
```
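
The table above can also be written down as a tiny lookup. This is just a sketch to make the mapping explicit; `expand_keyword` is a made-up helper name, not something cron provides:

```bash
# Translate a special word into the five-field schedule listed in the table above
expand_keyword() {
  case "$1" in
    @yearly|@annually) echo "0 0 1 1 *" ;;
    @monthly)          echo "0 0 1 * *" ;;
    @weekly)           echo "0 0 * * 0" ;;
    @daily|@midnight)  echo "0 0 * * *" ;;
    @hourly)           echo "0 * * * *" ;;
    *)                 echo "$1" ;;   # @reboot (and anything unknown) has no five-field form
  esac
}

for word in @reboot @yearly @monthly @weekly @daily @hourly; do
  printf '%-10s %s\n' "$word" "$(expand_keyword "$word")"
done
```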

## Storing the crontab output

By default cron saves the output of `/bin/execute/this/script.sh` in the
user's mailbox (root in this case). But it's prettier if the output is saved
in a separate logfile. Here's how:

```bash
*/10 * * * * /bin/execute/this/script.sh >> /var/log/script_output.log 2>&1
```

### Explained

Linux programs report on two channels: standard output (STDOUT) and
standard error (STDERR). STDOUT is file descriptor 1, STDERR is file descriptor 2. So the
following statement tells the shell to send STDERR to wherever STDOUT is going, creating
one datastream for messages & errors:

```bash
2>&1
```

Now that we have 1 output stream, we can pour it into a file. Where `>` will
overwrite the file, `>>` will append to the file. In this case we'd like
to append:

```bash
>> /var/log/script_output.log
```
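
The position of `2>&1` relative to the file redirection matters, and that is easy to try out without touching cron. In the sketch below, `noisy` is a hypothetical stand-in for the script, and a temporary file takes the place of the real logfile:

```bash
# Hypothetical stand-in for script.sh: one line on STDOUT, one on STDERR
noisy() {
  echo "normal message"
  echo "error message" >&2
}

logfile=$(mktemp)

# File first, then 2>&1: both streams are appended to the file
noisy >> "$logfile" 2>&1
right_order=$(cat "$logfile")

: > "$logfile"   # empty the file again

# 2>&1 first: STDERR is tied to the old STDOUT, so only STDOUT reaches the file
escaped=$(noisy 2>&1 >> "$logfile")
wrong_order=$(cat "$logfile")
rm -f "$logfile"

echo "file with '>> file 2>&1':"; echo "$right_order"
echo "file with '2>&1 >> file':"; echo "$wrong_order"
echo "escaped the file:";         echo "$escaped"
```

So if errors are missing from your logfile, check that `2>&1` comes last.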

## Mailing the crontab output

By default cron saves the output in the user's mailbox (root in this case) on
the local system. But you can also configure crontab to forward all output to
a real email address by starting your crontab with the following line:

```bash
MAILTO="yourname@yourdomain.com"
```

### Mailing the crontab output of just one cronjob

If you'd rather receive only one cronjob's output in your mail, make sure this
package is installed:

```bash
$ aptitude install mailx
```

And change the cronjob like this:

```bash
*/10 * * * * /bin/execute/this/script.sh 2>&1 | mail -s "Cronjob output" yourname@yourdomain.com
```

## Trashing the crontab output

Now that's easy:

```bash
*/10 * * * * /bin/execute/this/script.sh > /dev/null 2>&1
```

Just redirect all the output to the null device, also known as the black hole. On
Unix-like operating systems, `/dev/null` is a special file that discards all
data written to it.
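
Here's a quick sketch to convince yourself that nothing survives the black hole, while the exit status of the command is left alone. `noisy` is again a hypothetical stand-in for the script:

```bash
# Hypothetical stand-in for script.sh: prints on both streams and exits with status 3
noisy() {
  echo "normal message"
  echo "error message" >&2
  return 3
}

status=0
captured=$(noisy > /dev/null 2>&1) || status=$?

echo "captured output: '${captured}'"
echo "exit status:     ${status}"
```

No output means cron has nothing to mail, so be sure you really don't care about errors before trashing them.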

## Caveats

Many scripts are tested in a Bash environment with the `PATH` variable
set. This way it's possible your scripts work in your shell, but when
run from cron (where the `PATH` variable is different), the script
cannot find referenced executables, and fails.

It's not the job of the script to set `PATH`, it's the responsibility of
the caller, so it can help to `echo $PATH`, and put `PATH=<the result>`
at the top of your cron files (right below `MAILTO`).
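
To get a feeling for what a script sees when cron runs it, you can start a command in an almost empty environment with `env -i`. This is only a sketch: the `/usr/bin:/bin` value is an assumption about a typical minimal `PATH`, so check what your own cron actually uses:

```bash
# Assumption: a minimal PATH such as cron might provide; yours may differ
cron_like_path="/usr/bin:/bin"

# env -i starts the command with an empty environment, plus whatever we pass explicitly
seen_path=$(env -i PATH="$cron_like_path" /bin/sh -c 'echo "$PATH"')

echo "PATH in your shell:         $PATH"
echo "PATH in the bare-bones run: $seen_path"
```

Replacing the `echo` with the path to your own script is a cheap way to catch missing executables before the job is scheduled.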
]]></content:encoded>
      <dc:date>2007-07-29T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Schedule Automatic Updates on Ubuntu</title>
      <link>https://kvz.io/schedule-automatic-updates-on-ubuntu.html</link>
      <description><![CDATA[Making sure your system is up to date is key to its security.
Furthermore Ubuntu releases updates pretty often and you probably don't want
to miss out on added stability and features. You could run updates manually,
but why not schedule the updates in the background to make sure you are always
running the latest stable versions, without ever having to worry about it?
]]></description>
      <pubDate>Sun, 29 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/schedule-automatic-updates-on-ubuntu.html</guid>
      <content:encoded><![CDATA[Making sure your system is up to date is key to its security.
Furthermore Ubuntu releases updates pretty often and you probably don't want
to miss out on added stability and features. You could run updates manually,
but why not schedule the updates in the background to make sure you are always
running the latest stable versions, without ever having to worry about it?

<!--more-->

## Update

This article was written before Ubuntu's [unattended-upgrades](https://help.ubuntu.com/12.04/serverguide/automatic-updates.html) existed. Consider using that instead.

## Crontab

The crontab command, found in Unix and Unix-like operating systems, is used to
schedule commands to be executed periodically. To see which cronjobs are
currently scheduled for the root user, open a terminal and run:

```bash
$ sudo crontab -l
```

To edit the list of *cronjobs* you can run:

```bash
$ sudo crontab -e
```

This will open the default editor (could be `vi` or `nano`; if you want, you can
[change the default editor](/blog/2007/07/11/change-the-default-editor/)) to let us manipulate the crontab.
When you save and exit the editor, your cronjobs are installed into the crontab. Cronjobs are written in the following format:

```bash
* * * * * /bin/execute/this/script.sh
```

If you want to know more about crontab, I've written another article:
[Schedule tasks on Linux using crontab](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/)

## Updating With Aptitude

I always used `apt-get` to update systems but I found out that `aptitude` has
better dependency solving capabilities. So let's use aptitude for this as well; it
comes preinstalled. Normally I would run something like this from a terminal:

```bash
$ aptitude update # gets information on the latest packages
$ aptitude dist-upgrade # upgrades every package (kernel too)
```

### Making It Cron-Ready

We need to make some adjustments to the aptitude command to make it suitable
to run in the background:

- It should not have to wait on user confirmation, because it isn't getting
  any ; )
- It should not automatically update kernels (this is still something you
  should do manually)
- It should log to a file so you can keep track of it
- It should not proceed with an *upgrade* if the *update* failed
- It should be prefixed with a full path. Because cron often works without
  environment variables

The following command takes on all of the above challenges, in just one
line:

```bash
(/usr/bin/aptitude -y update && /usr/bin/aptitude -y safe-upgrade) >> /var/log/auto_update.log 2>&1
```

### Explained

- `-y` answers yes to all questions so that takes care of the user
  confirmation
- changing `dist-upgrade` to `safe-upgrade` will skip kernel updates
- `>> /var/log/auto_update.log 2>&1` forwards all messages (errors (2), and
  standard (1)) to a logfile. The order matters: `2>&1` has to come after the
  file redirection, otherwise errors end up in cron's mail instead of the log
- `&&` links two commands together, but will not execute the second if the
  first one failed.
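
The `&&` behaviour is easy to try without touching your packages. In this sketch `fake_update` and `fake_upgrade` are made-up stand-ins for the two aptitude calls:

```bash
# Made-up stand-ins for "aptitude update" and "aptitude safe-upgrade"
fake_update() {
  echo "update: $1"
  [ "$1" = "ok" ]          # succeeds only when called with "ok"
}
fake_upgrade() {
  echo "upgrade: running"
}

# The update succeeds, so the upgrade runs as well
good=$( (fake_update ok && fake_upgrade) 2>&1 )

# The update fails, so && never starts the upgrade
bad=$( (fake_update failed && fake_upgrade) 2>&1 ) || true

echo "--- update succeeded ---"; echo "$good"
echo "--- update failed ---";    echo "$bad"
```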

## Combined: An Aptitude Cronjob

We'll link everything together now. Open your crontab editor:

```bash
$ sudo crontab -e
```

And to execute our upgrade every night at 1AM type:

```bash
0 1 * * * (/usr/bin/aptitude -y update && /usr/bin/aptitude -y safe-upgrade) >> /var/log/auto_update.log 2>&1
```

Save and exit your editor, and you are all set! You could check the logfile:
`/var/log/auto_update.log` every once in a while to see if everything is still
running smoothly.
]]></content:encoded>
      <dc:date>2007-07-29T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Block Brute Force Attacks With Iptables</title>
      <link>https://kvz.io/block-brute-force-attacks-with-iptables.html</link>
      <description><![CDATA[Since 2005 there has been an immense increase in brute force SSH attacks
and though Linux is pretty secure by default, it does not stop evil
programs from indefinitely trying to login with different passwords. Without
proper protection your server is a sitting duck waiting for a bot to guess the
right combination and hit the jackpot. But with just 2 commands we can stop
that.
]]></description>
      <pubDate>Sat, 28 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/block-brute-force-attacks-with-iptables.html</guid>
      <content:encoded><![CDATA[Since 2005 there has been an immense increase in brute force [SSH](/categories/ssh/) attacks
and though [Linux](/categories/linux/) is pretty [secure](/categories/security/) by default, it does not stop evil
programs from indefinitely trying to login with different passwords. Without
proper protection your server is a sitting duck waiting for a bot to guess the
right combination and hit the jackpot. But with just 2 commands we can stop
that.

<!--more-->

## Symptoms

Here's an example of the `auth.log` file. You can see that even as I'm writing
this article bots are trying different account combinations to get into my
server:

```bash
Jul 28 21:32:16 impala sshd[10855]: Illegal user office from 213.191.74.219
Jul 28 21:32:16 impala sshd[10855]: Failed password for illegal user office from 213.191.74.219 port 53033 ssh2
Jul 28 21:32:16 impala sshd[10857]: Illegal user samba from 213.191.74.219
Jul 28 21:32:16 impala sshd[10857]: Failed password for illegal user samba from 213.191.74.219 port 53712 ssh2
Jul 28 21:32:16 impala sshd[10859]: Illegal user tomcat from 213.191.74.219
Jul 28 21:32:16 impala sshd[10859]: Failed password for illegal user tomcat from 213.191.74.219 port 54393 ssh2
Jul 28 21:32:16 impala sshd[10861]: Illegal user webadmin from 213.191.74.219
Jul 28 21:32:16 impala sshd[10861]: Failed password for illegal user webadmin from 213.191.74.219 port 55099 ssh2
```

Do you see the rate at which this is happening? Today's connection speeds
allow crackers to try an enormous number of combinations every second!
It's time to stop this before someone hits the jackpot and my server is
compromised.
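
If you want numbers instead of a gut feeling, you can count the failed attempts per address. The sketch below runs on a copy of the sample lines above so it works anywhere. On a real server you'd point it at your own `auth.log`, whose location (often `/var/log/auth.log`) is an assumption here:

```bash
log=$(mktemp)
cat > "$log" <<'EOF'
Jul 28 21:32:16 impala sshd[10855]: Illegal user office from 213.191.74.219
Jul 28 21:32:16 impala sshd[10855]: Failed password for illegal user office from 213.191.74.219 port 53033 ssh2
Jul 28 21:32:16 impala sshd[10857]: Illegal user samba from 213.191.74.219
Jul 28 21:32:16 impala sshd[10857]: Failed password for illegal user samba from 213.191.74.219 port 53712 ssh2
Jul 28 21:32:16 impala sshd[10859]: Illegal user tomcat from 213.191.74.219
Jul 28 21:32:16 impala sshd[10859]: Failed password for illegal user tomcat from 213.191.74.219 port 54393 ssh2
Jul 28 21:32:16 impala sshd[10861]: Illegal user webadmin from 213.191.74.219
Jul 28 21:32:16 impala sshd[10861]: Failed password for illegal user webadmin from 213.191.74.219 port 55099 ssh2
EOF

# For every "Failed password" line, count the word that follows "from"
summary=$(awk '/Failed password/ {
  for (i = 1; i < NF; i++) if ($i == "from") hits[$(i + 1)]++
}
END { for (ip in hits) print hits[ip], ip }' "$log")

echo "$summary"   # prints: 4 213.191.74.219
rm -f "$log"
```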

## About iptables

[Iptables](/categories/iptables/) is the standard Linux [firewall](/categories/firewall/) and though I use
[Ubuntu](/categories/ubuntu/), it should be installed by default on any modern distribution. But
it doesn't do anything yet. It's just sitting there, so we need to teach it
some rules to prevent brute force attacks.

There are tools available to do this for us like *fail2ban*. Though it's a
great piece of software and certainly has its advantages, in this article I'd
like to stick with iptables because fail2ban parses log files to detect brute
force attacks at a certain interval, whereas iptables works directly on the
kernel level. Besides I don't think many people know about iptables' full
capabilities, and it comes preinstalled!

## Easy Setup - Just 2 Rules

Because iptables comes standard with every Linux distribution we'll skip right
to setting up the specific firewall rules we need. In depth configuring of
iptables takes a bit of understanding and is not within the scope of this
article, but let's take a look at these two statements:

```bash
$ sudo iptables -A INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -m recent --set --name SSH
$ sudo iptables -A INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -m recent --update --seconds 60 --hitcount 8 --rttl --name SSH -j DROP
```

The `-i eth0` part names the network interface on which SSH connections arrive.
Typically this is eth0, but you may need to change it.

**That's it!** Together they rate-limit new incoming SSH connections per source
address: once an address reaches 8 attempts within one minute, its further
attempts are dropped. Normal users will have no trouble logging in, but
brute force attacks are cut down from an unlimited number of account
combinations to a handful per minute. **That's awesome!**
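
To see how the counting plays out, here is a simplified model of the two rules in plain shell. It is not how the kernel implements the `recent` match, only a sketch of the idea for a single source address: the first rule records every new attempt, the second one drops the attempt once 8 of them fall inside the 60 second window. All names are made up for the example:

```bash
window=60      # --seconds 60
hitcount=8     # --hitcount 8
stamps=""      # timestamps (in seconds) of the attempts seen from one address
accepted=0
dropped=0

attempt() {
  now=$1
  stamps="$stamps $now"                  # first rule: --set records the attempt
  hits=0
  for t in $stamps; do                   # second rule: count attempts inside the window
    if [ $((now - t)) -le "$window" ]; then
      hits=$((hits + 1))
    fi
  done
  if [ "$hits" -ge "$hitcount" ]; then
    dropped=$((dropped + 1))
    echo "t=${now}s dropped  ($hits attempts in window)"
  else
    accepted=$((accepted + 1))
    echo "t=${now}s accepted ($hits attempts in window)"
  fi
}

# A bot trying once per second, followed by one more try two minutes later
for second in 0 1 2 3 4 5 6 7 8 9; do
  attempt "$second"
done
attempt 130

echo "accepted=$accepted dropped=$dropped"
```

In this model the 8th attempt inside the window is the first one to be dropped, and the late attempt gets through again because the earlier ones have aged out.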

### Failsafe

While you're still testing, you might want to
[add the following line to your crontab](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/)

```bash
*/10 * * * * /sbin/iptables -F
```

This will flush all the rules every 10 minutes, just in case you lock yourself
out. When you're happy with the results of your work, remove the line from
your crontab, and you're in business.

## Advanced Setup - Want More?

### Restore on Boot

You will find that on your next reboot, the rules are lost. Damn! You probably
want these 2 brute force protection rules automatically restored, right? The
most elegant way would probably be to restore the iptables rules when your
network interface comes back online. Here's how I would do this on Ubuntu. Let's
get the following content in a file: `/etc/network/if-up.d/bfa_protection`

```bash
#!/usr/bin/env bash
[ "${METHOD}" != loopback ] || exit 0
/sbin/iptables -A INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -m recent --set --name SSH
/sbin/iptables -A INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -m recent --update --seconds 60 --hitcount 8 --rttl --name SSH -j DROP
```

Save the file and make it executable:

```bash
$ chmod a+x /etc/network/if-up.d/bfa_protection
```

Now every time your interface comes up, the rules are added to iptables.
Sweet.

### Remove on Shutdown

But to do this really cleanly, we need a script that removes the rules
again when the interface goes down, just to make sure the rules are
never added twice. So let's also create a file: `/etc/network/if-down.d/bfa_protection`

```bash
#!/usr/bin/env bash
[ "${METHOD}" != loopback ] || exit 0
/sbin/iptables -D INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -m recent --set --name SSH
/sbin/iptables -D INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -m recent --update --seconds 60 --hitcount 8 --rttl --name SSH -j DROP
```

`-D` removes a rule whereas `-A` adds one. Anyway. Let's save this file and
make it executable:

```bash
$ chmod a+x /etc/network/if-down.d/bfa_protection
```

**That's it!** We're in business!

### Like to Test It?

Very wise indeed, well `iptables -L` shows active rules so why not execute the
following:

```bash
$ /etc/network/if-up.d/bfa_protection
$ iptables -L
```

**Perfect.** If you have another machine (not the one you're working on! You
do not want to take the risk of getting banned yourself!) you could really
test it by logging in 8 times within 60 seconds. See if you get banned!

Now does the removal script work as well?

```bash
$ /etc/network/if-down.d/bfa_protection
$ iptables -L
```

Now the rules should be gone.

### Undo

And oh yes, if at any time you run into problems, the following command will
flush all the iptables rules:

```bash
$ iptables -F
```

And you can undo by just removing the files we created:

```bash
$ rm /etc/network/if-up.d/bfa_protection
$ rm /etc/network/if-down.d/bfa_protection
$ iptables -F # flush all the rules, just in case
```

## More on Iptables

This is just one nice example of what you can do with the iptables firewall
but there are many other uses for iptables in order to secure your system.
There are scripts / wizards that will help you set up iptables rules like
ksecure\_firwall (a bash script by myself), or more widely used programs
like fwbuilder or firestarter (both available through package management like
apt).

If you'd like to know more about iptables, [this is a place to start](https://help.ubuntu.com/community/IptablesHowTo), or
you could [just google of course](https://www.google.com/search?hl=en&client=firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&hs=ePm&q=simple+iptables+tutorial&btnG=Search).
]]></content:encoded>
      <dc:date>2007-07-28T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Make SSH Connections With PHP</title>
      <link>https://kvz.io/make-ssh-connections-with-php.html</link>
      <description><![CDATA[Not everyone knows about PHP's capabilities of making SSH connections and executing remote commands, but it can be very useful. I've been using it a lot in PHP CLI applications that I run from cronjobs, but initially it was a pain to get it to work. The PHP manual on Secure Shell2 Functions is not very practical or thorough for that matter, so I would like to share my knowledge in this how-to, to make setting this up a little less time consuming.
]]></description>
      <pubDate>Tue, 24 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/make-ssh-connections-with-php.html</guid>
      <content:encoded><![CDATA[![ssh php](/assets/images/posts/sshp1.png "ssh php")Not everyone knows about [PHP](/categories/php/)'s capabilities of making [SSH](/categories/ssh/) connections and executing remote commands, but it can be very useful. I've been using it a lot in PHP [CLI](/categories/cli/) applications that I run from cronjobs, but initially it was a pain to get it to work. The [PHP manual on Secure Shell2 Functions](https://www.php.net/manual/en/ref.ssh2.php) is not very practical or thorough for that matter, so I would like to share my knowledge in this how-to, to make setting this up a little less time consuming.

<!--more-->

In this article I'm going to assume that:

- You're running Debian / [Ubuntu](/categories/ubuntu/)
  If not, you will have to substitute the package manager aptitude with
  whatever your distribution provides
- You're running PHP 5
  If not, just replace php5 with php4 everywhere
- You have basic knowledge of PHP & server administration
- You already have PHP installed

## Update

On recent Ubuntu machines, there's no need to do any compiling anymore:

```bash
$ aptitude install libssh2-1-dev libssh2-php
```

You can now test if PHP recognizes its new ssh2 extension by running:

```bash
$ php -m |grep ssh2
```

It should return: `ssh2`

If the above works for you (you should see also: "Build process
completed successfully"), you can skip to: **Great! PHP supports SSH - time to code**.

Otherwise we need to compile manually, continue reading here.

## Prerequisites

### Packages

First let's install the following packages:

```bash
$ aptitude install php5-dev php5-cli php-pear build-essential libssl-dev zlib1g-dev
```

That should set us up alright.

### libssh2

We need [libssh2](/categories/libssh2/) from SourceForge. We have to compile this,
but no worries, this is all you need to do:

```bash
$ cd /usr/src \
 && wget https://surfnet.dl.sourceforge.net/sourceforge/libssh2/libssh2-0.14.tar.gz \
 && tar -zxvf libssh2-0.14.tar.gz \
 && cd libssh2-0.14/ \
 && ./configure \
 && make all install
```

That's it! Easy right?

- Update: since December 26th 2008, libssh2 has reached version
  1.0. Though I have not tested it: it has been reported to work.
  So you may want to check [sf.net](https://sourceforge.net/projects/libssh2/) and download the latest stable
  version.

## Installation

### ssh2.so

Next we need to link libssh2 & PHP together. There's a [PECL](/categories/pecl/) module for this
so let's install using:

```bash
$ pecl install -f ssh2
```

The `-f` makes sure ssh2 is installed even though there's not a stable candidate.
You could also use the package name: *ssh2-beta* to overrule this.

Now you need to make sure our new `ssh2.so` module is loaded by PHP. Edit a
`php.ini` file (I'd recommend a separate one: `/etc/php5/conf.d/ssh2.ini`).
Make sure it reads:

```bash
extension=ssh2.so
```

## Great! PHP Supports SSH - Time to Code

You've just enabled ssh2 support in PHP. Now how can we make use of this?
There are 2 options. SSH supports the:

- **Execute method**
  This tells the server's operating system to execute something and pipe
  the output back to your script. (recommended)
- **Shell method**
  This opens an actual shell to the operating system, just as you would
  normally when logging in with your terminal application. Some [routers](/categories/router/)
  that don't have a full POSIX compliant implementation, but run their
  own application as soon as you login, require this. (advanced)

### Method 1: Execute

Best would be to create functions or even a class for the following code, but
this is the basic idea and will definitely get you started:

```php
<?php
if (!function_exists("ssh2_connect")) die("function ssh2_connect doesn't exist");
// log in at server1.example.com on port 22
if(!($con = ssh2_connect("server1.example.com", 22))){
    echo "fail: unable to establish connection\n";
} else {
    // try to authenticate with username root, password secretpassword
    if(!ssh2_auth_password($con, "root", "secretpassword")) {
        echo "fail: unable to authenticate\n";
    } else {
        // all right, we're in!
        echo "okay: logged in...\n";

        // execute a command
        if (!($stream = ssh2_exec($con, "ls -al" ))) {
            echo "fail: unable to execute command\n";
        } else {
            // collect returning data from command
            stream_set_blocking($stream, true);
            $data = "";
            while ($buf = fread($stream,4096)) {
                $data .= $buf;
            }
            fclose($stream);
        }
    }
}
?>
```

### Method 2: Shell

Best would be to create functions or even a class for the following code, but
this is the basic idea and will definitely get you started:

```php
<?php
if (!function_exists("ssh2_connect")) die("function ssh2_connect doesn't exist");
// log in at server1.example.com on port 22
if (!($con = ssh2_connect("server1.example.com", 22))) {
    echo "fail: unable to establish connection\n";
} else {
    // try to authenticate with username root, password secretpassword
    if (!ssh2_auth_password($con, "root", "secretpassword")) {
        echo "fail: unable to authenticate\n";
    } else {
        // all right, we're in!
        echo "okay: logged in...\n";

        // create a shell
        if (!($shell = ssh2_shell($con, 'vt102', null, 80, 40, SSH2_TERM_UNIT_CHARS))) {
            echo "fail: unable to establish shell\n";
        } else {
            stream_set_blocking($shell, true);
            // send a command
            fwrite($shell, "ls -al\n");
            sleep(1);

            // & collect returning data
            $data = "";
            while ($buf = fread($shell,4096)) {
                $data .= $buf;
            }
            fclose($shell);
        }
    }
}
?>
```

## Tips

Sometimes when a server is busy, or a connection is buggy, the buffer may
run dry, and the PHP script stops collecting data from a command's
output (even though the command hasn't completed yet!). One thing
you could do about that is append a marker to the command:

```php
<?php
ssh2_exec($con, 'ls -al; echo "__COMMAND_FINISHED__"' );
?>
```

Now, in the loop where you keep reading from the buffer, just check
whether the `__COMMAND_FINISHED__` line has come by, because then you know
you have all the data. To avoid infinite loops, just limit the loop
with a timeout of 10 seconds or so:

```php
<?php
$time_start = time();
$data       = "";
while (true){
    $data .= fread($stream, 4096);
    if (strpos($data,"__COMMAND_FINISHED__") !== false) {
        echo "okay: command finished\n";
        break;
    }
    if ((time()-$time_start) > 10 ) {
        echo "fail: timeout of 10 seconds has been reached\n";
        break;
    }
}
?>
```

In the example above, you'd better set *stream\_set\_blocking* to **false**.
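
It can help to see what the remote shell actually sends back for such a command. The sketch below runs a similar command line locally through `sh -c` instead of over SSH (that substitution, and the `printf` standing in for `ls -al`, are assumptions to keep the example self-contained), checks for the marker and strips it from the data again:

```bash
marker="__COMMAND_FINISHED__"

# Local stand-in for the command line you would hand to ssh2_exec
remote_command="printf 'line one\nline two\n'; echo \"$marker\""
raw=$(sh -c "$remote_command")

# Only trust the output when the marker has arrived
case "$raw" in
  *"$marker"*) finished=yes ;;
  *)           finished=no ;;
esac

# Drop the marker line so only the real command output remains
data=$(printf '%s\n' "$raw" | grep -v "^$marker\$")

echo "finished: $finished"
echo "$data"
```

Pick a marker that is unlikely to show up in the real output of your command.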

## Can't get enough?

PHP can send files over ssh!

```php
<?php
ssh2_scp_send($con, "/tmp/source.dat", "/tmp/dest.dat", 0644);
?>
```

## Doesn't work?

Check the following:

- Did you follow every step of the prerequisites & installation how
  to in this article?
- **On the serverside, 'PasswordAuthentication yes' must be enabled
  in the sshd\_config.**
  Default is yes on most servers, but in some cases you will have to
  turn this on yourself
  by making sure the following line is in place in the file:
  `/etc/ssh/sshd_config`: `PasswordAuthentication yes`

If you've made any changes, ssh needs a restart:

```bash
$ /etc/init.d/ssh restart
```

Post a comment here if it's still failing. Don't forget to paste the
error that you're getting.

### make: \*\*\* [ssh2.lo] Error 1

If you get the error:

```bash
/usr/include/php5/Zend/zend_API.h:361: note: expected char * but argument is of type const unsigned char *
make: *** [ssh2.lo] Error 1
```

that's because of PHP 5.3 incompatibility. Try this [patch](https://pecl.php.net/bugs/bug.php?id=16727):

```bash
$ mkdir -p /usr/src \
 && cd /usr/src \
 && wget https://pecl.php.net/get/ssh2-0.11.0.tgz \
 && tar xvfz ssh2-0.11.0.tgz \
 && cd ssh2-0.11.0 \
 && wget https://remi.fedorapeople.org/ssh2-php53.patch \
 && patch -p0 < ssh2-php53.patch \
 && phpize && ./configure --with-ssh2 \
 && make
```

### make: \*\*\* [ssh2_fopen_wrappers.lo] Error 1

If you get the error:

```bash
/tmp/pear/download/ssh2-0.11.0/ssh2_fopen_wrappers.c:49: error: for each function it appears IN.)
make: *** [ssh2_fopen_wrappers.lo] Error 1
ERROR: `make' failed
```

This is the [reported fix](https://cvs.php.net/viewvc.cgi/pecl/ssh2/ssh2-fopen-wrappers.c?r1=1.15&r2=1.16&diff-format=u) (thanks to BuNker).

## Alternatives

There have been some additional developments since the writing of this
article. Check out:

- [Net\_SSH2](https://pear.php.net/package/Net-SSH2), PEAR's SSH wrapper (driver based to support multiple
  ways of establishing the connection)
- [SSH2](https://www.seoegghead.com/blog/seo/ssh2-php-howto-guide-ssh-connections-made-easy-in-php-p343.html), another wrapper by Jaimie Sirovich
- [phpseclib](https://phpseclib.sourceforge.net/), a pure PHP implementation - no additional libraries,
  binaries, bindings required (against all odds, still pretty fast with mcrypt
  installed)
]]></content:encoded>
      <dc:date>2007-07-24T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Create Turbocharged Storage Using tmpfs</title>
      <link>https://kvz.io/create-turbocharged-storage-using-tmpfs.html</link>
      <description><![CDATA[Everyone knows that RAM is so much faster than a hard disk. To
illustrate, while a current SATA disk has peak transfer rates of 375
MB/s, current RAM can do a mind blowing 12,500 MB/s! Normally only the system
itself makes use of this ultra fast storage, but we can also access this space
directly. And that opens a great window of opportunity.
]]></description>
      <pubDate>Wed, 18 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/create-turbocharged-storage-using-tmpfs.html</guid>
      <content:encoded><![CDATA[Everyone knows that [RAM](/categories/ram/) is so much faster than a hard disk. To
illustrate, while a current [SATA](/categories/sata/) disk has peak transfer rates of 375
MB/s, current RAM can do a mind blowing 12,500 MB/s! Normally only the system
itself makes use of this ultra fast storage, but we can also access this space
directly. And that opens a great window of opportunity.

<!--more-->

## Possible Uses

There is an unlimited number of uses for this technology, but here are 3 from
my own experience:

- What if you're running a blog, so successful that your server can't handle
  it. Although your blogging software caches plain html files in the /cache dir
  to speed up processing, it still doesn't give you enough [performance](/categories/performance/).
- What if you're an internet host and you want to show off your massive
  bandwidth by having users download a dummy .bin file of 100MB. You'll find
  that if many users access this file at the same time, your hard drive becomes
  the slow factor and you're running into [disk IO](/categories/io/) problems.
- What if you're running a [PXE](/categories/pxe/) server with an ISO stored on it, and an
  entire webcluster is accessing this file for installation. Again. Your hard
  drive will not be able to cope with these kinds of speeds.

If only you could store these files in memory..

## How Does It Work?

Everybody who's running a [linux](/categories/linux/) server must have seen the
[/dev/shm](/categories/devshm/) on their system.

This is not a normal directory on your machine. It is intended to appear as a
mounted file system, but one which uses virtual memory instead of a persistent
storage device. The standard `/dev/shm` grows automatically as more space is
needed, but is by default limited to half of your physical RAM. If you have
2GB, it can grow to 1GB at most.

So everything you copy to that place is in fact stored in your RAM. And
*that's cool* because your RAM is about 33 times faster than your normal
filesystem!
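
The numbers in this section are simple enough to check in a shell. The figures are the ones quoted in this article, not measurements, and the variable names are made up:

```bash
sata_mb_per_s=375       # peak SATA transfer rate quoted above
ram_mb_per_s=12500      # RAM transfer rate quoted above
ratio=$((ram_mb_per_s / sata_mb_per_s))    # integer division
echo "RAM is roughly ${ratio} times faster"

ram_total_mb=2048       # the 2GB example
shm_limit_mb=$((ram_total_mb / 2))         # default limit: half of physical RAM
echo "/dev/shm can grow to ${shm_limit_mb}MB"
```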

## Let's do this

*In this article I presume you have basic knowledge of server administration. Be careful, because you could really mess things up if you've got no clue what you're doing. I warned you!*

### Use Current Volume

As mentioned before, you probably already have a `/dev/shm` on your linux system.
So just copy a file to it:

```bash
$ cp -af /root/100mb.bin /dev/shm/
```

And now it's in your RAM! Reading this file can be around 30 times faster than before!
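
If you don't have a 100MB file lying around, the following sketch creates a small test file, copies it over and verifies the copy. It uses `/dev/shm` when it is available; the fallback to `/tmp` is only there so the example runs on systems without it, and the file names are made up:

```bash
# Use /dev/shm when present; fall back to a normal temp dir otherwise
target=/dev/shm
if [ ! -d "$target" ] || [ ! -w "$target" ]; then
  target=${TMPDIR:-/tmp}
fi

workdir=$(mktemp -d "$target/ramdemo.XXXXXX")
source_file=$(mktemp)
head -c 1048576 /dev/zero > "$source_file"    # 1MB test file

cp -af "$source_file" "$workdir/1mb.bin"

# cmp -s is silent and only reports through its exit status
if cmp -s "$source_file" "$workdir/1mb.bin"; then
  copied=yes
else
  copied=no
fi
echo "copied intact: $copied"

# Shows which filesystem the copy lives on
df -k "$workdir" | tail -n 1

rm -rf "$workdir" "$source_file"
```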

### Enlarge Current Volume

But maybe the limit of half of your physical RAM just does not cut it for you.
Then you might want to increase the maximum size of this volume to 4GB:

```bash
$ mount -o remount,size=4G /dev/shm
```

### Create a New Volume

Another possibility is to create a brand new memory device. We can do this
with the filesystem type: [tmpfs](/categories/tmpfs/). Let's say you want to create a tmpfs
instance on `/var/www/www.mysite.com/ramdrive` that can allocate a max of
500MB RAM and that can only be changed by root, but accessed by everyone (like
[Apache](/categories/apache/)):

```bash
$ mkdir -p /var/www/www.mysite.com/ramdrive
$ mount -t tmpfs -o size=500M,mode=0755 tmpfs /var/www/www.mysite.com/ramdrive
```

### Restore That Volume Every Time Your Server Boots

Easy, just make sure the directory exists, and add the following line to your
`/etc/fstab`:

```bash
tmpfs /var/www/www.mysite.com/ramdrive tmpfs size=500M,mode=0777 0 0
```

This will create a ramdrive of max 500MB in */var/www/www.mysite.com/ramdrive*
every time your server boots. The mode 0777 will give it full access for
everybody on your system, so just change that to a suitable mode (see [umask](/categories/umask/)).

Done.

## Pitfalls

Now as with everything *too* cool, there is a pitfall:

- Everything in tmpfs is temporary in the sense that no files will be
  created on your hard drive. If you reboot, everything in tmpfs will be lost.

So you will need to create a script that automatically restores the files or
applications that you need. You can let this script run at boot time, or
[schedule it as a cronjob](/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/).
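
Such a restore script doesn't have to be complicated. Here's a minimal sketch: `restore_ramdrive` is a made-up helper, and the two temporary directories stand in for your persistent copy and the tmpfs mount:

```bash
# restore_ramdrive SOURCE TARGET: copy everything from persistent storage into the ramdrive
restore_ramdrive() {
  mkdir -p "$2"
  cp -a "$1"/. "$2"/
}

# Temporary directories standing in for the real paths
persistent=$(mktemp -d)
ramdrive=$(mktemp -d)
echo "cached page" > "$persistent/index.html"

restore_ramdrive "$persistent" "$ramdrive"
ls "$ramdrive"
```

In real use you'd call the function with something like your backup directory and `/var/www/www.mysite.com/ramdrive`, after the tmpfs has been mounted.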
]]></content:encoded>
      <dc:date>2007-07-18T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Control Cache Expire Dates Using Htaccess</title>
      <link>https://kvz.io/control-cache-expire-dates-using-htaccess.html</link>
      <description><![CDATA[If you're running Squid to cache your website, you can use an
htaccess file to control what kind of files should be cached, and for how
long.
]]></description>
      <pubDate>Mon, 16 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/control-cache-expire-dates-using-htaccess.html</guid>
      <content:encoded><![CDATA[If you're running [Squid](/categories/squid/) to cache your website, you can use an
[htaccess](/categories/htaccess/) file to control what kind of files should be cached, and for how
long.

<!--more-->

## Prerequisites

First you should enable [mod\_expires](/categories/mod-expires/), the [Apache](/categories/apache/) module that can
control the Expires HTTP header in server responses:

```bash
$ a2enmod expires
```

## htaccess

Next create a `.htaccess` file in your web root, containing:

```bash
ExpiresActive On
ExpiresDefault "access plus 4 hours"
ExpiresByType application/javascript A900
ExpiresByType application/x-javascript A900
ExpiresByType text/javascript A900
ExpiresByType text/html A90
ExpiresByType text/xml A90
ExpiresByType text/css A900
ExpiresByType text/plain A62
ExpiresByType image/gif A14400
ExpiresByType image/jpg A14400
ExpiresByType image/jpeg A14400
ExpiresByType image/png A14400
ExpiresByType image/bmp A14400
ExpiresByType application/x-shockwave-flash A3600
```

And that's it! Play around a bit with the values to suit your needs.
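
For reference, `A900` means 900 seconds counted from the moment the file is
accessed (an `M` prefix would count from the file's last modification instead).
To check what Apache actually sends, you can look at the response headers. The
sketch below is a small filter that keeps only the caching-related headers; the
`cache_headers` name and the example URL are made up for illustration.

```bash
# Keep only the caching-related lines from HTTP response headers on stdin.
cache_headers() {
  grep -i -E '^(expires|cache-control):'
}

# Example (hypothetical URL):
#   curl -sI http://www.example.com/style.css | cache_headers
```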
]]></content:encoded>
      <dc:date>2007-07-16T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Install Squid &amp; Apache on 1 Server</title>
      <link>https://kvz.io/install-squid-apache-on-1-server.html</link>
      <description><![CDATA[Let's say your site is becoming a big success and as a result it's becoming
slower and slower. There are several things you can do without buying additional
hardware:
]]></description>
      <pubDate>Sun, 15 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/install-squid-apache-on-1-server.html</guid>
      <content:encoded><![CDATA[Let's say your site is becoming a big success and as a result it's becoming
slower and slower. There are several things you can do without buying additional
hardware:

<!--more-->

- Clean up your code
- [Speedup your website with Cache\_Lite](/blog/2007/08/01/speedup-your-website-with-cache-lite/)
- Deploy memcached
- Deploy eAccelerator
- Optimize your database layout & [config file](/categories/config-file/).
- [Create a memory file system](/blog/2007/07/18/create-turbocharged-storage-using-tmpfs/)
- Install [Squid](/categories/squid/)

*In this article I assume you have basic knowledge of server administration.
Be careful, because you could really mess things up if you've got no clue what
you're doing. I warned you!*

## Squid?

If you don't have the money to buy additional hardware, you should know that
it's always an option to install Squid on the same server that Apache runs
on. This is how it works:

- We're going to install Squid
- We're going to run [Apache](/categories/apache/) on port 8080
- We're going to run Squid on port 80
- When a request from a web browser reaches port 80, Squid will first check
  if it has the result stored in memory.
- **If so**, it is:
  - served to the web browser immediately without troubling the Apache
    server
- **If not**, it is:
  - fetched from the Apache server
  - stored in memory for the next time
  - served to the web browser

Now that you have an idea of the logic behind Squid, let's put it to use!

## Let's do this

Installing squid is easy, just use your distro's package manager. On
[Ubuntu](/categories/ubuntu/) it would look like this:

```bash
$ sudo aptitude install squid
```

You can make Apache run on port 8080 by editing `/etc/apache2/ports.conf`:

```bash
Listen 127.0.0.1:8080
```

Now let's edit Squid's config file, `/etc/squid/squid.conf`:

```bash
# Define the HTTP port (change the IP address and default site to yours)
http_port 123.123.123.123:80 vhost vport=8080 defaultsite=www.example.com
# Specify the local and remote peers
cache_peer 127.0.0.1 parent 8080 0 no-query originserver name=server1

# Tell squid which domains to forward to which servers (change the domain to yours)
acl sitedomains dstdomain .example.com
cache_peer_access server1 allow sitedomains
# some restriction definitions
acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
#acl webcluster src 87.233.132.114
acl webcluster src 87.233.132.112/28
acl purge method PURGE
acl CONNECT method CONNECT

# some restrictions
http_access allow manager localhost
http_access allow manager webcluster
http_access deny manager
http_access allow purge localhost
http_access allow purge webcluster
http_access deny purge
# Make sure that access to your accelerated sites is allowed
http_access allow sitedomains
# Deny everything else
http_access deny all

# Do not cache cgi-bin, ? urls, posts, etc.
hierarchy_stoplist cgi-bin ?
acl QUERY urlpath_regex cgi-bin \?
acl POST method POST
no_cache deny QUERY
no_cache deny POST
acl apache rep_header Server ^Apache
broken_vary_encoding allow apache
refresh_pattern .              60       100%     4320

# Do not cache 404s 403s, etc
negative_ttl 0 minutes
# Debug info in cache.log?
# debug_options ALL,1 33,2

# Cache properties
cache_mem 500 MB
maximum_object_size_in_memory 2048 KB
cache_replacement_policy heap LRU
memory_replacement_policy heap LRU
cache_dir ufs /var/spool/squid 20000 16 256
access_log /var/log/squid/access.log squid
hosts_file /etc/hosts
```

The values you will most likely want to change are listed below, and I've
placed some comments in the config for you to read. Some extra notes on:

- *cache\_mem 500MB*

Squid claims this much RAM, so change it to fit your needs. Check how much
memory is available on your server and cap it so other processes have room to
breathe.

- *123.123.123.123*

Change this to your public IP address.

- *example.com*

Change this to your domain name.

You may want to play with the config a little more. Every site is different
and some sites just don't like it that they're being cached, but this should
definitely get you started. That reminds me, you'll have to restart the
services in order for this to work of course.

```bash
$ sudo /etc/init.d/apache2 restart
$ sudo /etc/init.d/squid restart
```
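
To see whether a response really came out of the cache, you can look at the
`X-Cache` header that Squid adds to its responses. The helper below is only a
sketch: the `squid_cache_status` name and the example URL are assumptions made
for this illustration.

```bash
# Print HIT or MISS based on the X-Cache header in HTTP response headers on stdin.
# Prints nothing if there is no X-Cache header at all.
squid_cache_status() {
  tr -d '\r' | awk 'tolower($1) == "x-cache:" { print $2; exit }'
}

# Example (hypothetical URL); -D - dumps the headers, -o /dev/null drops the body:
#   curl -s -o /dev/null -D - http://www.example.com/logo.png | squid_cache_status
```

For a cacheable file, a repeated request would normally report `HIT`.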

## Some Final Notes

- If your web application gives you a hard time, consider caching only media
  files like JPGs and FLVs, and have the rest passed on to Apache. It's the
  safest setup, and it can still save you quite a bit of [disk I/O](/categories/io/) on the
  server.
- You can use [htaccess](/categories/htaccess/) files to [control what kind of files should be
  cached, and for how long](/blog/2007/07/16/control-cache-expire-dates-using-htaccess/).
- It could be that your web statistics (AWStats or Webalizer maybe) display
  incorrect graphs because they parse Apache log files, and the log files
  contain fewer records because Squid is handling a lot of the requests. You could:
  - teach your stats program to read Squid's log files
  - use a stats program like Google Analytics, which is not affected
    because the clients send a separate request to a stats server.
- A nice overview of Squid's configuration options can be found [here](https://www.visolve.com/squid/squid24s1/contents.php).
]]></content:encoded>
      <dc:date>2007-07-15T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Use PEAR With open_basedir and safe_mode Restrictions</title>
      <link>https://kvz.io/use-pear-with-open-basedir-and-safe-mode-restricti.html</link>
      <description><![CDATA[You want your website to be as safe as possible. So you'll typically want
Open Basedir and Safe Mode to be on. When you're in a shared hosting
environment, you'll find that any server administrator with a good sense of
security will also have these restrictions in place. However, security pretty
much always limits functionality, and this case is no different: what
if you are caged in a restricted environment and would still like to use
shared libraries like the ones provided by PEAR?
]]></description>
      <pubDate>Thu, 12 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/use-pear-with-open-basedir-and-safe-mode-restricti.html</guid>
      <content:encoded><![CDATA[You want your website to be as safe as possible. So you'll typically want
[Open Basedir](/categories/open-basedir/) and [Safe Mode](/categories/safemode/) to be on. When you're in a shared hosting
environment, you'll find that any server administrator with a good sense of
security will also have these restrictions in place. However, security pretty
much always limits functionality, and this case is no different: what
if you are caged in a restricted environment and would still like to use
shared libraries like the ones provided by [PEAR](/categories/pear/)?

<!--more-->

You will have to exclude the PEAR directories from the restrictions. This can
be done in the virtual host config as follows:

```bash
php_admin_value open_basedir /var/www/www.mysite.com/:/usr/share/php/
php_admin_value safe_mode_include_dir /usr/share/php/
```

## Explained

```bash
php_admin_value              means we are going to tell php to change a variable
open_basedir                 means change to which paths your scripts are limited
/var/www/www.mysite.com/     is    where your site can be found
:                            means "also"
/usr/share/php/              is    where your PEAR library can be found

php_admin_value              means tell php to change a variable
safe_mode_include_dir        means ignore owner mismatches in the following directory
/usr/share/php/              is    where your PEAR library can be found
```
]]></content:encoded>
      <dc:date>2007-07-12T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>What's the Deal With php_value, php_admin_flag, Etc</title>
      <link>https://kvz.io/whats-the-deal-with-php-value-php-admin-flag-etc.html</link>
      <description><![CDATA[I ran across php_value, php_flag, php_admin_value and php_admin_flag in a
couple of .htaccess files, and I've used them sometimes as well by just
pasting an example, but I've never really understood why there was such a
great diversity. Couldn't php_setting X Y just handle it, and if not, what do
the admin, value and flag attributes mean?
]]></description>
      <pubDate>Wed, 11 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/whats-the-deal-with-php-value-php-admin-flag-etc.html</guid>
      <content:encoded><![CDATA[I ran across php\_value, php\_flag, php\_admin\_value and php\_admin\_flag in a
couple of .htaccess files, and I've used them sometimes as well by just
pasting an example, but I've never really understood why there was such a
great diversity. Couldn't php\_setting X Y just handle it, and if not, what do
the admin, value and flag attributes mean?

<!--more-->

## Explained

So one time when I had nothing to do, I figured let's end the doubts once and
for all, let's google :) And here's what I found:

- **php\_value name value**
   Sets the value of the specified directive. Can be used only with PHP\_INI\_ALL
  and PHP\_INI\_PERDIR type directives. To clear a previously set value use none
  as the value. Note: Don't use php\_value to set boolean values. php\_flag (see
  below) should be used instead.
- **php\_flag name on|off**
   Used to set a boolean configuration directive. Can be used only with
  PHP\_INI\_ALL and PHP\_INI\_PERDIR type directives.
- **php\_admin\_value name value**
   Sets the value of the specified directive. This can not be used in .htaccess
  files. Any directive type set with php\_admin\_value can not be overridden by
  .htaccess or virtualhost directives. To clear a previously set value use none
  as the value.
- **php\_admin\_flag name on|off**
   Used to set a boolean configuration directive. This can not be used in
  .htaccess files. Any directive type set with php\_admin\_flag can not be
  overridden by .htaccess or virtualhost directives.

## More

Source: [www.php.net/manual/en/configuration.changes.php](https://www.php.net/manual/en/configuration.changes.php)
]]></content:encoded>
      <dc:date>2007-07-11T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Change the Default Editor</title>
      <link>https://kvz.io/change-the-default-editor.html</link>
      <description><![CDATA[Ever wanted to change the crontab of a server, but got an editor on
screen that you're totally unfamiliar with? There are a lot of causes for this
annoyance, but one is that somebody recently installed or used Midnight Commander (mc), which for whatever reason seems to overrule your session's default editor.
]]></description>
      <pubDate>Wed, 11 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/change-the-default-editor.html</guid>
      <content:encoded><![CDATA[Ever wanted to change the [crontab](/categories/crontab/) of a server, but got an editor on
screen that you're totally unfamiliar with? There are a lot of causes for this
annoyance, but one is that somebody recently installed or used Midnight Commander (mc), which for whatever reason seems to overrule your session's default editor.

<!--more-->

## Changing the Editor

The first time, it took me a while to figure it out, so I thought: let's make it
an article on my site, so maybe it will help others save some time.

Anyhow, here's how to change it. Just open your terminal and type:

```bash
$ export EDITOR="vim"
```

*vim* can be [vim](/categories/vim/), nano, or another text editor of your choice of course.

## Make It Permanent

If the problem persists, you might want to add the export to your `.bashrc`
file, or even to the `/etc/profile` file. But that seems a bit radical since
that's system-wide :)
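
As a sketch, the helper below appends the export to a startup file only when
that exact line is not already present, so running it twice does no harm. The
`persist_editor` name is made up for this example.

```bash
# Append an EDITOR export to a shell startup file, unless that exact line is already there.
# Usage: persist_editor <editor> [file]   (file defaults to ~/.bashrc)
persist_editor() {
  editor="$1"
  rcfile="${2:-$HOME/.bashrc}"
  line="export EDITOR=\"$editor\""
  # -x matches whole lines, -F treats the line as a fixed string, -q stays quiet.
  grep -qxF "$line" "$rcfile" 2>/dev/null || printf '%s\n' "$line" >> "$rcfile"
}

# Example: persist_editor vim
```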

## Debian / Ubuntu

As noted by Roland in the comments, on Debian / Ubuntu you can use:

```bash
$ sudo update-alternatives --config editor
```

Much better!
]]></content:encoded>
      <dc:date>2007-07-11T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Cat a File, Without the Comments</title>
      <link>https://kvz.io/cat-a-file-without-the-comments.html</link>
      <description><![CDATA[I recently had to install a couple of squid servers to act as reverse proxies
for a webcluster. You can teach the squid server to stand in between the
end users and the webservers, and to store all the static content ( .jpg .flv
.css .htm for example ) in RAM. This saves a lot of I/O and bandwidth on
the webservers, and it can really speed up a site. At the end of the road
the webservers' load had dropped by 92%. But before all this worked, I had to
run through a massive config file, and since the squid config file doubles as
its manual, it's about 5000 lines long. So I had to find a
way to filter only the important settings from the config file.
]]></description>
      <pubDate>Wed, 11 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/cat-a-file-without-the-comments.html</guid>
      <content:encoded><![CDATA[I recently had to install a couple of squid servers to act as reverse proxies
for a webcluster. You can teach the squid server to stand in between the
end users and the webservers, and to store all the static content ( .jpg .flv
.css .htm for example ) in RAM. This saves a lot of I/O and bandwidth on
the webservers, and it can really speed up a site. At the end of the road
the webservers' load had dropped by 92%. But before all this worked, I had to
run through a massive config file, and since the squid config file doubles as
its manual, it's about 5000 lines long. So I had to find a
way to filter only the important settings from the config file.

<!--more-->

This is what I came up with:

```bash
$ cat /etc/squid/squid.conf | egrep -v "(^#.*|^$)"
```

## Explained

```bash
egrep -v      means leave out the lines that match the following
^#.*          means lines that begin with a #
|             means or
^$            means lines that are empty
```

## Updates

### update #1

Thanks to an insightful comment by *Darwin Award Winner* on this article,
here's a version that would also filter comments with spaces before the #,
such as comments that are indented inside config blocks:

```bash
$ cat /etc/squid/squid.conf | egrep -v "^\s*(#|$)"
```

Thanks Darwin! :wink:
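
If you use this often, it can be wrapped in a small function. The sketch below
uses `grep -E` with the POSIX class `[[:space:]]` instead of `egrep` and `\s`,
on the assumption that this is a bit more portable across grep versions; the
`strip_comments` name is made up for this example.

```bash
# Print a config file without blank lines and without comment lines,
# including comments that are indented with spaces or tabs.
# Reads the given file(s), or stdin when no file is given.
strip_comments() {
  grep -Ev '^[[:space:]]*(#|$)' "$@"
}

# Example: strip_comments /etc/squid/squid.conf
```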
]]></content:encoded>
      <dc:date>2007-07-11T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Beautify URLs</title>
      <link>https://kvz.io/beautify-urls.html</link>
      <description><![CDATA[Readable URLs are nice. A well made website will have a logical layout, with
intelligent folder and file names, and as few technical details as possible.
In the most well designed sites, readers can guess at filenames with a high
level of success. Clean URLs are great because they:
]]></description>
      <pubDate>Wed, 11 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/beautify-urls.html</guid>
      <content:encoded><![CDATA[Readable URLs are nice. A well made website will have a logical layout, with
intelligent folder and file names, and as few technical details as possible.
In the most well designed sites, readers can guess at filenames with a high
level of success. Clean URLs are great because they:

<!--more-->

- allow search engines to better spider your site and increase your
  PageRank
- are easy to remember
- hide underlying technology, and reduce hacking attempts
- look nice
- are easier to link to
- reduce the number of typos

Now before we go any further explaining about beautified URLs, let's first see
what I mean by an **ugly URL**:

`https://yourdomain.com/index.php?cat_id=12&artcl_id=9&act=edit`

This just does not look professional, and invites people to tamper with your
variables. So how can we turn this into a **nice URL**:

`https://yourdomain.com/blog_posts/edit/beautify_urls/`

There are several ways, but the one that I used is with an [Apache](/categories/apache/) module
called [mod\_rewrite](/categories/mod-rewrite/). This module can take any [URL](/categories/url/) and change it
quickly before your pages are accessed. You can tell mod\_rewrite to do things
by writing an [.htaccess](/categories/htaccess/) file.

## OK, Let's do this

Fine, create a file in your web root (the directory with your main index),
call it `.htaccess` (include the dot), and add the following line that tells
Apache to enable the module I told you about:

```bash
RewriteEngine on
```

Now on the next line we need to tell the module to secretly rewrite the new &
beautiful URLs to the ugly URLs that lie beneath (we still need those,
otherwise your site won't function, right?). So let's say we wanted to rewrite
`/blog` to `/index.php?page=blog`:

```bash
RewriteRule ^([a-zA-Z0-9_]+)$ /index.php?page=$1
```

## What Just Happened?

Let's explain what all these nasty characters ([Regular Expression](/categories/regex/)) mean:

- `^` marks the beginning of the URL
- `([a-zA-Z0-9_]+)` is the part that captures the page name:
  - `( )` capture whatever matches between these
  - `[ ]` any one of the characters between these will do
  - `+` match one or more of those characters
  - `a-z` match all lowercase letters
  - `A-Z` match all uppercase letters
  - `0-9` match all digits
  - `_` let's also match the underscore character
- `$` marks the end of the URL

Everything that's matched is stored into a variable: **$1**. This variable now
contains **blog**. And we can use it to rewrite **blog** to
/index.php?page=**blog** . The beauty of this is that it also works for other
words now.
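
To experiment with the pattern outside Apache, you can mimic the rule with
`sed`. This is only a rough sketch: sed's regex dialect is close to, but not
the same as, the one mod\_rewrite uses, and the `rewrite_preview` name is made
up for this example.

```bash
# Mimic the RewriteRule with sed to see what a given path would be rewritten to.
# Paths that do not match the pattern are printed unchanged.
rewrite_preview() {
  printf '%s\n' "$1" | sed -E 's#^([a-zA-Z0-9_]+)$#/index.php?page=\1#'
}

rewrite_preview blog        # prints /index.php?page=blog
rewrite_preview blog/       # no match (trailing slash), prints blog/
```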

## Making It More Solid

So much for the basics. If you want to know more about regular expressions,
`.htaccess`, Apache, or `mod_rewrite`, I suggest you look them up somewhere
else; this article is not about those.

You may find that `blog` is now secretly directed to
`/index.php?page=blog`, but what about `blog/`? How about first
visibly redirecting people from `blog` to `blog/` and then secretly directing
them to `/index.php?page=blog`? For this we would need the following
`.htaccess` file:

```bash
RewriteEngine on
RewriteRule ^([a-zA-Z0-9_]+)$ /$1/ [R]
RewriteRule ^([a-zA-Z0-9_]+)/$ /index.php?page=$1
```

Notice the new line in the middle? It says: rewrite **blog** to **blog/** and
let the people know. This is done with the `[R]` flag, which makes it a visible
redirect.

The last line then secretly rewrites **blog/** to index.php?page=**blog**.

## But What if I Have More Than One Variable to Pass On?

What you could do is just repeat yourself in the .htaccess file like so:

```bash
RewriteEngine on
RewriteRule ^([a-zA-Z0-9_]+)$ /$1/ [R]
RewriteRule ^([a-zA-Z0-9_]+)/$ /index.php?page=$1

RewriteRule ^([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)$  /$1/$2/ [R]
RewriteRule ^([a-zA-Z0-9_]+)/([a-zA-Z0-9_]+)/$ /index.php?page=$1&subpage=$2
```

There are a lot of things you can do with mod\_rewrite. You can add conditions,
create complicated regexes, etc. If you're interested, just google for it.

## Doesn't work?

There are 2 things that need to be in place in order for this to work:

**1.** The `.htaccess` needs to be allowed to control the Rewrite module. Make
sure the Vhost contains:

```bash
AllowOverride All
```

**2.** The Rewrite module must be enabled, in the terminal type:

```bash
$ a2enmod rewrite
```
]]></content:encoded>
      <dc:date>2007-07-11T00:00:00+02:00</dc:date>
    </item>
    <item>
      <title>Hello, World!</title>
      <link>https://kvz.io/hello-world.html</link>
      <description><![CDATA[Hello World! In my day to day I do a lot of development &amp; sysadmin research, 
often taking quick notes so I don't forget.
]]></description>
      <pubDate>Tue, 10 Jul 2007 00:00:00 +0200</pubDate>
      <guid>https://kvz.io/hello-world.html</guid>
      <content:encoded><![CDATA[Hello World! In my day to day I do a lot of development & sysadmin research, 
often taking quick notes so I don't forget. 

On this blog I intend to brush up and share those notes, hoping to save
others some work, and also to learn of better ways through your feedback.

Here goes nothing!
]]></content:encoded>
      <dc:date>2007-07-10T00:00:00+02:00</dc:date>
    </item>
    <dc:date>2022-07-26T00:00:00+02:00</dc:date>
  </channel>
</rss>