13 April 2013
Hi! It has been way too long since I’ve shared some of the awesome work that we’re doing throughout Windows Azure. I wanted to share an application that I built up a few nights ago that really highlights how things can come together easily using the open source Windows Azure SDKs that my team works on.
In this post I’m going to create both a client and cloud application that is glued together by JSON. The whole thing is written in JavaScript. I’m calling the app my “Windows Azure Table Explorer” - it is used for dynamically browsing through tabular data stored in the cloud.
The info that I’m storing in the table service is log data from my live application that manages push notification connections from mobile devices. For each connection I log the phone manufacturer, OS version, and so on. I also log the severity of each log message - these connection entries are simply 'info' - which lets me filter out warning/error messages easily.
I’ll also be introducing a few essentials and Windows Azure basics in this post since I haven’t actually blogged very much about the work we’re doing in the cloud, so many of my blog readers may not be familiar with the products and technologies.
The Windows Azure Table Service lets you store large amounts of structured, non-relational data. Behind the scenes, the Azure storage service scales to meet throughput demands, partitioning entities across servers based on the PartitionKey you assign to each row.
I’m building this table explorer app because the Windows Azure management site doesn’t display table information today - just basic info about your blob containers and the blob contents. So I’m building this site to iterate through table contents easily from any web browser. Note that the nice Visual Studio tooling that we ship also lets you query tables.
When you store data in the table, you simply provide the set of key/value pairs for the row of data. You do not need to define the columns ahead of time, and each entity can be unique in shape. So in this situation, where I am building out a table explorer, I cannot assume a homogeneous set of data: I’m going to show the table of data using a superset of all the columns found in a given page of data.
Each entity is a set of properties and can be up to 1 MB in size. Entities can have up to 252 custom properties - 255 once you include the three system properties, PartitionKey, RowKey, and Timestamp, that Windows Azure’s storage service maintains for its own use. (PartitionKey, RowKey) is the unique pair that identifies any row within the table.
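To make that concrete, here’s an illustrative sketch of one of my log entities as a JavaScript object. The custom property names here are made up for the example; only the three system properties are fixed:

```javascript
// Three system properties, plus whatever custom properties the row needs.
var entity = {
  PartitionKey: '2013-04-13',        // chosen by the app; groups related rows
  RowKey: '0000000123',              // must be unique within the partition
  Timestamp: '2013-04-13T09:30:00Z', // maintained by the storage service
  level: 'info',                     // custom properties - no schema required
  manufacturer: 'NOKIA',
  osVersion: '8.0'
};
```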
One thing that I really like about the Table Service in Windows Azure is the pricing model: it is priced the same way as all other storage, including blobs and queues - a price per GB per month, plus a very small price per transaction. I found that for my application, table storage was cheaper than Amazon SimpleDB, for example. It lets me store loose data in the cloud that I can query without standing up a SQL server or a more complicated NoSQL solution.
You can find current storage pricing details here.
I have an application in production in the cloud today that logs debug-style information to make it easy to monitor what’s going on. The log provider I use writes to the Windows Azure Table Service through Winston - the standard Node.js logging module that many people use - plus an extension that supports table storage.
To actually interact with Windows Azure’s services I am using the Windows Azure SDK for Node.js, a product that my dev team has put a really nice amount of work into.
The Node.js SDK is powerful, simple, and acts exactly like any Node.js module. You’ll find it on npm as `azure`. It supports blob storage, tables, queues, Service Bus, and many other services. We have also built a cross-platform command line interface (CLI) for Windows Azure that you can grab in the `azure-cli` module - more on that another day.
All interactions with the table service are done through a simple table service instance that lets you query entities, insert entities, etc.
```javascript
var azure = require('azure');
var tableService = azure.createTableService();
```
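For a quick taste of the API beyond that, here’s a minimal sketch - the table name and entity values are made up for illustration - that creates a table if needed and inserts a single log row:

```javascript
var azure = require('azure');
var tableService = azure.createTableService();

// Create the table on first use, then insert a single log entity.
tableService.createTableIfNotExists('log', function (error) {
  if (error) { throw error; }

  tableService.insertEntity('log', {
    PartitionKey: '2013-04-13',
    RowKey: Date.now().toString(),
    level: 'info'
  }, function (insertError) {
    if (insertError) { console.error(insertError); }
  });
});
```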
You’ll find a good set of documentation on this in the wiki on GitHub or by walking through the Node + table service tutorial on WindowsAzure.com.
If you’re looking to get in touch with members of the SDK team, many of us are on Twitter and we have a list of Twitter handles on our site about open source.
For almost all of my web projects today I just use Windows Azure Web Sites. As of April the functionality is still in preview, but I’ve had great success in terms of uptime, availability and functionality. The key features I love about the Web Sites product:

- Continuous deployment: whenever I push to the `master` branch on GitHub, the Windows Azure Web Site will automatically update and deploy the latest good version of my app.
- On the scaling side, for my projects I use a set of multiple small instances dedicated to my projects. This way incoming requests are automatically load balanced among several virtual machines, yielding great performance, a little redundancy, and the flexibility to handle traffic quite well.
To use the services associated with a storage account, you use a pair of values: the account name and an access key. This information is important to keep safe. You’ll find the keys inside the Windows Azure management portal and should take care to protect them: only send them over secure SSL connections, or store them on the cloud/server side in a secure manner.
If you store the credentials in the cloud configuration, the site will greet you right away with a choice of tables, since it does not require you to enter credentials. This is really only good for testing or temporary use while developing an app that uses the table service.
I’m building the web service side using Node.js. It’s funny how nice JavaScript is becoming to work with.
For this project, I’m using Express 3, a fairly typical and friendly way to build up a simple web site complete with routes, middleware for processing requests, and serving static files as well if needed.
So my full set of dependencies includes:

```
npm install express -g && express
npm install azure
```
You can explore the complete source code of the running application on GitHub at https://github.com/jeffwilcox/azure-table-explorer.
To expose JSON-serving endpoints for my web app, I’m creating a small Express submodule for my site. Express is a super easy way to do this: it natively recognizes JSON requests and can respond with a simple JSON payload, making AJAX sites super easy to build out.
What I am doing in this app is actually exposing this sub-route for JSON as an Express module itself. This way, it can easily be added to another web site using Express through a simple import.
So for my Express JSON routes, I expose just two GET operations: one that returns the names of all the tables in my storage account, and one that returns a set of rows from a table. I also have a DELETE operation for removing rows from a table.
```javascript
var express = require('express')
  , app = module.exports = express()
  , queryTables = require('./json/queryTables')
  , queryTable = require('./json/queryTable')
  , deleteRow = require('./json/deleteRow');

app.get('/table', queryTables);
app.get('/table/:tableName', queryTable);
app.delete('/table/:tableName/:partitionKey/:rowKey', deleteRow);
```
Now if you look at this code, you will see a few things if you’re familiar with Express:

- The routes use named parameters that Express exposes through `request.params.*` - `tableName`, `partitionKey`, and so on. These are non-optional in my use cases here.
- Each route handler lives in its own file that is pulled in with a `require` statement.

We’ll open up one of those sub files like `queryTables.js` in a little bit…
One thing that I decided I wanted to do was allow the application to work in one of two ways:

- with the storage credentials stored in the cloud configuration as environment variables, or
- with credentials that the user enters in the browser, sent along with each request.
This is easy to do since the Azure SDK will look for credentials using environment variables first. The magical environment variable names are `AZURE_STORAGE_ACCOUNT` and `AZURE_STORAGE_ACCESS_KEY`. This makes it easy to deploy your app using source control, such as how I have done with GitHub: the credentials are stored in the cloud configuration and not in source code.
So these lines of code behave the same when the environment variables are set:
```javascript
var azure = require('azure');

var tableService = azure.createTableService(); // implicitly uses env variables

tableService = azure.createTableService(
    process.env.AZURE_STORAGE_ACCOUNT,
    process.env.AZURE_STORAGE_ACCESS_KEY); // explicit
```
So to do this, whenever a request comes in for the `/json/*` routes of my application, I’m going to use a piece of simple middleware to look for query string parameters carrying the account name and access key. If I find them, I use them.
Here’s the Express middleware I am exposing. I am wrapping the createTableService call in a try/catch as the function will perform some basic validation, such as checking for a base64 access key - it will throw if these conditions are not met.
```javascript
var azure = require('azure');

module.exports = function getTableServiceInstance(req, res, next) {
  // Prefer the cloud-configured credentials; fall back to the query string.
  var account = process.env.AZURE_STORAGE_ACCOUNT || req.query["account"];
  var key = process.env.AZURE_STORAGE_ACCESS_KEY || req.query["key"];

  // Reject the request if either credential is missing.
  if (!account || !key) {
    return res.send(403);
  }

  try {
    req.tableService = azure.createTableService(account, key);
  } catch (err) {
    // createTableService validates its inputs (the access key must be
    // base64, for example) and throws when they are malformed.
    return res.send(403);
  }

  next();
};
```
And then to plug this in to the routes, here is the routing source I am using with Express for my json endpoints:
```javascript
var express = require('express')
  , app = module.exports = express()
  , queryTables = require('./json/queryTables')
  , queryTable = require('./json/queryTable')
  , deleteRow = require('./json/deleteRow')
  , tableServiceMiddleware = require('./json/tableServiceMiddleware');

app.use(tableServiceMiddleware);

app.get('/table', queryTables);
app.get('/table/:tableName', queryTable);
app.delete('/table/:tableName/:partitionKey/:rowKey', deleteRow);
```
So by calling `app.use(...)` with the `tableServiceMiddleware` code I created, I automatically create the table service and place it inside the `request` object for all of the code in the `/json` paths on my server. Very simple, and it really cuts down on code duplication and provides a single code path for instantiating the Windows Azure SDK.
Now for the raw implementation details, I’d like to fill out the source code for what will be returned when someone hits the `/json/table` endpoint of my web service with either user-supplied credentials or cloud-configured credentials.
For this I know that `tableService` will be a property already hanging off of the `request` object that Express is now using, thanks to the simple middleware I have written.
So the actual implementation is super quick:

- It calls the `queryTables` function that the Azure table service object exposes.
- On errors, my `send` helper code will send back a response that my AJAX app can use to share the error message.

Here’s the entire file for `routes/json/queryTables.js`:
```javascript
var send = require('./send');

module.exports = function queryTables (req, res, next) {
  req.tableService.queryTables(function (error, result, response) {
    send.errorElse(res, error, function () {
      var tables = [];
      for (var tbl in result) {
        var table = result[tbl];
        if (table.TableName) {
          tables.push(table.TableName);
        }
      }

      send.content(res, {
        tables: tables,
        name: req.tableService.storageAccount
      }, 'result');
    });
  });
};
```
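The little `send` helper module isn’t shown in the post; here’s a minimal sketch of what it could look like, assuming `errorElse` only runs the success callback when there was no error, and `content` wraps the payload under a named key:

```javascript
// routes/json/send.js - a hypothetical version of the helper used above.
module.exports = {
  // Relay a storage error to the client, or continue on success.
  errorElse: function (res, error, success) {
    if (error) {
      return res.json(500, { error: error.message || String(error) });
    }
    success();
  },

  // Wrap the payload under a named key, e.g. { result: { tables: [...] } }.
  content: function (res, payload, name) {
    var body = {};
    body[name] = payload;
    res.json(body);
  }
};
```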
Now we are ready to start interacting with this locally. No matter what platform you are using, Node.js is awesome at this. The development environment for Express is nice, too, since it will actually show you some nice, simple diagnostics information in your terminal window about any and all requests.
All I need to do in my terminal window is run `node server` and the site starts up locally; the terminal then shows the Express diagnostics for each request as I use the site.
Now, I’m getting ahead of myself a little here, but realize that I’m using a local HTTP endpoint without any encryption. In my app I have added special code to warn users that their credentials will not be transmitted securely in this situation.
So locally it is easy to work on debugging the AJAX app, server code, you name it.
Now, getting this from my local dev environment to the cloud is getting so easy these days. I’ve been coding up this project as a Git repo, with its origin on GitHub.
So locally, I just do a `git push origin master`, use my Git account credentials, and the source is pushed to GitHub.
Windows Azure Web Sites automatically gets notified of this change, updates the Node.js deployment, and re-launches the site. For now I’ve called my Azure web site `waztable` and it is deployed in just a few seconds to waztable.azurewebsites.net. As part of this process, for a Node.js application, the deployment looks at the `package.json` file and goes ahead and automatically runs `npm install` for both the azure and express modules for me.
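For reference, a minimal `package.json` that would make this work might look something like the following; the name and version ranges here are illustrative, not the app’s actual manifest:

```json
{
  "name": "azure-table-explorer",
  "version": "0.0.1",
  "scripts": {
    "start": "node server"
  },
  "dependencies": {
    "express": "3.x",
    "azure": "*"
  }
}
```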
For the scenario where I want to store the credentials in the cloud, I can do this really quickly right inside the management portal for my Azure Web Site.
Just go to the CONFIGURE tab inside the site and, under app settings, add the environment variable key/value pairs.
Easy! Since my data is not mission critical - it’s just logging data, and not personal or private information - I’m OK having this unsecured for demo purposes. But do realize I’m exposing the entire set of table data and the ability to delete rows.
Realistically I would want to wrap this with another more traditional authentication mechanism.
For my app to be secure when I am not storing the credentials in the cloud configuration, I need to prompt the user and then have this information securely sent with every AJAX HTTP request. Clearly, I need SSL encryption.
Web Sites supports HTTPS: azurewebsites.net has a wildcard certificate that enables encryption for this scenario.
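If you wanted to push users to the secure endpoint automatically, a small piece of Express middleware could do it. This is just a sketch with an assumption on my part: behind the Web Sites load balancer the original scheme typically arrives in a forwarded header, so checking `req.secure` alone isn’t enough.

```javascript
var express = require('express');
var app = express();

// Hypothetical redirect-to-HTTPS middleware for the hosted site.
app.use(function (req, res, next) {
  var secure = req.secure ||
               req.headers['x-forwarded-proto'] === 'https' ||
               req.headers['x-arr-ssl'] !== undefined; // assumed Azure header

  if (!secure && process.env.NODE_ENV === 'production') {
    return res.redirect('https://' + req.headers.host + req.url);
  }

  next();
});
```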
Earlier you saw the “scary” version of my app’s entrance page - when you are using a non-SSL connection, it shows a big orange alert. For local debugging purposes I allow the credentials to be entered.
Now, when I point my browser to https://waztable.azurewebsites.net/, I instead get greeted with a nice, simple, non-scary credentials dialog.
To interact with the web service, I’m creating a single-page application that is a static HTML file, built on a lot of great JavaScript frameworks and Bootstrap. All my other client logic will be in a simple application-specific JavaScript file.
I’m using these client goodies to make development quick and easy:

- jQuery
- Bootstrap (with a custom theme)

This will save a lot of time and just let me focus on my JavaScript client code. Since it is a single-page application, things are easy, but clearly you’ll need a modern HTML5-compliant browser and JavaScript enabled.
For serving these frameworks I am using appropriate CDN locations: jQuery from one of its CDN homes, and for frameworks like Bootstrap with my custom theme, I’ve set up my own Windows Azure CDN endpoint for those assets.
Then, in the `public` folder of my Express site, I am placing the app-specific CSS and JavaScript files to make local debugging easy. I have an index.html, table.js, and table.css file - that’s it for this app on the client side. You can explore the public/ folder on GitHub.
This isn’t designed to be a tutorial on jQuery or using AJAX - this isn’t 2003 :-). I am also taking a lot of jQuery `.html(...)` shortcuts in this app. When I wrote this simple app, I was more focused on the cloud code, so I’ll admit my JavaScript on the client isn’t the prettiest or most well-formatted. But it works!
Once the page is ready, jQuery will let me go ahead and start writing code to make things happen. My init code does a number of things. One of them is hooking up a global AJAX error event with jQuery, which makes it easy to at least display a message or show the credentials entry portion of the UI. Here’s my `ajaxError` code.
```javascript
$(document).ajaxError(function (event, request, settings) {
  ++ajaxErrorCount;

  var subject = 'Error';
  var text = 'Something went wrong. Oops.';

  if (request.status == 403) {
    credentialsNeeded();
    if (ajaxErrorCount == 1) {
      return;
    }
    subject = '403 Unauthorized';
    text = 'An unauthorized response was returned. Most likely this means ' +
           'the storage account name or access key is incorrect or missing. ' +
           'Also: you can configure environment variables on the cloud side ' +
           'to automatically authenticate the table explorer.';
  } else {
    text = request.statusText || request.responseText;
  }

  showError(subject, text);
});
```

This will just show a modal Bootstrap dialog with the error information whenever it comes back.
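The `showError` function isn’t included in the post; here’s a minimal sketch of what it might do, assuming a Bootstrap modal with a hypothetical id of `errorModal` in the page markup:

```javascript
// Hypothetical: fill in and display the Bootstrap error modal.
function showError(subject, text) {
  $('#errorModal .modal-header h3').text(subject);
  $('#errorModal .modal-body').html(text);
  $('#errorModal').modal('show');
}
```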
The last thing I do is make an AJAX request to list all the tables in the storage account. The logic here is simple: if I receive an HTTP 403 response, the storage account credentials are incorrect or not yet entered, so show the credential UI. Otherwise, show the main UI.
We’ve already looked at the code exposed at the `/json/table` endpoint that returns all of the tables in my storage account. Let’s wire that up in my application code.
```javascript
var loadTablesList = function () {
  $.ajax({
    url: "/json/table",
    data: getStandardAjaxData(),
    success: function (data) {
      showErrorElse(data, function () {
        $('#results').show();

        var dr = data.result;
        if (dr.name) {
          storageAccount = dr.name;
          $('#accountName').html(storageAccount);
        }

        window.tables = dr.tables || [];

        var tableList = $("#tableList");
        tableList.empty();

        // Build a nav list item for each table name returned by the
        // service (the anchor markup here is illustrative).
        var html = '';
        for (var i = 0; i < window.tables.length; i++) {
          html += '<li><a href="#" class="tableLink">' +
                  window.tables[i] + '</a></li>';
        }
        tableList.append(html);
      });
    }
  });
};
```
So to walk through the code:

- The `$.ajax` jQuery call is used to hit the `/json/table` endpoint that was exposed by Express on the server side.
- On success, the handler empties the `nav-list` and adds list items for each table.

To actually show table results, I have a general function that is pretty horrible JS code. It iterates through the list of rows for the current view, creates a table by hand in HTML (I didn’t want to take on a complete jQuery data grid solution for this, but probably should have), and so on.
The function takes in the tableName, current page number, and the continuation token to use if we are requesting the next page:
```javascript
function loadTable(tableName, page, continuation) {
  currentTableName = tableName;

  var data = getStandardAjaxData();
  if (continuation) {
    data.nextRowKey = continuation.nextRowKey;
    data.nextPartitionKey = continuation.nextPartitionKey;
  }

  $.ajax({
    url: "/json/table/" + tableName,
    data: data,
    success: function (data) {
      showErrorElse(data, function () {
        loadTableData(tableName, data.table, page);
      });
    }
  });
}
```
I also store locally the view’s rows in JavaScript so that I can show detailed information such as the system properties for rows including the partition and row key data. I hook up a row click event to this code.
I also store an array in code of all the currently selected rows, similar to the Outlook.com user interface. This will allow for multi-select functions like delete.
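A rough sketch of what that selection bookkeeping can look like - the element ids and class names here are illustrative, not the app’s actual markup:

```javascript
var selectedRows = [];

// Toggle a row in and out of the selection when it is clicked.
$('#dataTable').on('click', 'tbody tr', function () {
  var rowIndex = $(this).index();
  var position = selectedRows.indexOf(rowIndex);

  if (position === -1) {
    selectedRows.push(rowIndex);
  } else {
    selectedRows.splice(position, 1);
  }

  $(this).toggleClass('selected');
});
```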
The `loadTableData` function does all the dirty HTML work of building out the table and wiring up events.
When the table service sends back a result, it optionally also includes a continuation token that has information about the next page of data: the next page’s starting PartitionKey and RowKey. These are used for walking an entire table’s results and effectively mean “there is another page of data to show”.
In this scenario, I want to make sure that the pagination UI in the app is updated and that I wire up an event to request the next set of data. That token is passed in the optional `continuation` parameter to the `loadTable` function above.
A continuation token from the Node SDK is very simple:
```javascript
{
    nextPartitionKey: "NEXT_PARTITION_KEY_HERE",
    nextRowKey: "NEXT_ROW_KEY"
}
```
So in the pagination list of pages, each of the associated “pages” (a concept local just to the app) is actually a pair: `(PartitionKey, RowKey)`.
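Here’s a sketch of how a pagination link could be wired up to that pair; `addPageLink` and the `#pagination` element are hypothetical names for illustration:

```javascript
// Each page link carries the continuation token that leads to it, so
// clicking the link re-requests that page from the server.
function addPageLink(tableName, page, continuation) {
  var item = $('<li><a href="#">' + page + '</a></li>');

  item.on('click', function (e) {
    e.preventDefault();
    loadTable(tableName, page, continuation);
  });

  $('#pagination').append(item);
}
```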
On the server side, this is what the code looks like for our `queryTable` JSON endpoint in Node.js. The code builds a SELECT query, applying an optional TOP (the maximum number of results to return) as well as an optional starting continuation token. It then runs the query and returns the raw JSON data to the AJAX app:
```javascript
var send = require('./send')
  , azure = require('azure')
  , TableQuery = azure.TableQuery;

module.exports = function queryTable (req, res, next) {
  var top = req.query.top;

  var query = TableQuery
    .select()
    .from(req.params.tableName);

  if (top) {
    query = query.top(top);
  }

  if (req.query.nextRowKey && req.query.nextPartitionKey) {
    query = query.whereNextKeys(req.query.nextPartitionKey, req.query.nextRowKey);
  }

  req.tableService.queryEntities(query, function (error, result, continuation, response) {
    send.errorElse(res, error, function () {
      var table = { };

      if (continuation && continuation.nextPartitionKey && continuation.nextRowKey) {
        table.continuation = {
          nextPartitionKey: continuation.nextPartitionKey,
          nextRowKey: continuation.nextRowKey
        };
      }

      table.rows = result;
      send.content(res, table, 'table');
    });
  });
};
```
The continuation token information is returned with the JSON results as well, enabling the UI to show the pagination page numbers and make those additional requests.
The local JavaScript variable `selectedRows` keeps a list of the row numbers in the current view that are selected, allowing me to look up the data that uniquely identifies each row.
I have a delete function that takes these rows and sends HTTP DELETE requests to my service. Once all the DELETE calls are done, I refresh the current page’s view using the same continuation token as before.
At this time I have not implemented a batch delete operation.
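The `deleteRow` route handler isn’t shown in the post; here’s a minimal sketch of what it could look like, reusing the middleware-provided `tableService`, the route parameters from the Express routes above, and the same (hypothetical) `send` helper:

```javascript
var send = require('./send');

// DELETE /table/:tableName/:partitionKey/:rowKey
module.exports = function deleteRow (req, res, next) {
  // The (PartitionKey, RowKey) pair uniquely identifies the entity.
  var entity = {
    PartitionKey: req.params.partitionKey,
    RowKey: req.params.rowKey
  };

  req.tableService.deleteEntity(req.params.tableName, entity,
    function (error, successful, response) {
      send.errorElse(res, error, function () {
        send.content(res, { deleted: true }, 'result');
      });
    });
};
```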
One thing that the user interface does not show is a concrete number of rows remaining in the table. Instead, with the pagination model, it simply shows the next page when there is information about the next page’s row and partition keys. This is an implementation detail of the Windows Azure table service: there is no “number of rows” property that is returned.
In order to show a count, I would have to walk through and scan the entire table: something that could be very slow and expensive for, say, a web app that is constantly logging hundreds of thousands of rows. Although I could do this behind the scenes in a long-running AJAX call, or occasionally generate a count for the table, I’ve decided not to worry about it today.
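For what it’s worth, here’s roughly what such a scan would look like, sketched with the same query pattern used above; on a large table this could take a very long time and cost real transactions, which is exactly why I skipped it:

```javascript
var azure = require('azure')
  , TableQuery = azure.TableQuery;

// Walk every page of a table via continuation tokens and count the rows.
function countRows(tableService, tableName, callback) {
  var count = 0;

  function nextPage(continuation) {
    var query = TableQuery.select().from(tableName);

    if (continuation) {
      query = query.whereNextKeys(continuation.nextPartitionKey,
                                  continuation.nextRowKey);
    }

    tableService.queryEntities(query, function (error, results, continuation) {
      if (error) { return callback(error); }

      count += results.length;

      if (continuation && continuation.nextPartitionKey) {
        nextPage(continuation); // more pages remain
      } else {
        callback(null, count);
      }
    });
  }

  nextPage(null);
}
```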
This application is live and hosted on Windows Azure Web Sites as of April 2013. I’ll probably eventually need to move the endpoint elsewhere, but for now, you will find the app live at https://waztable.azurewebsites.net/
The open source code is online at https://github.com/jeffwilcox/azure-table-explorer
When I have time I’d like to add the ability to clear all rows from a table (delete and re-create the table), create a new table through the site, or add a basic entity to the table. I also thought about directly hitting table storage through AJAX instead of going through the server that then pipes back JSON to the app - but for now, I don’t see this as a huge issue with the app. I also thought about adding the ability to get a one-time approximation of the number of rows in the table.
Have a few minutes? Check out our open source work at http://windowsazure.github.io/.
We actually accept contributions. We license everything under Apache 2. We are on GitHub. Please join our community today!
If you want to learn more, check out:
Hope this helps, and let me know what you think on Twitter!
Jeff Wilcox is a Software Engineer at Microsoft in the Open Source Programs Office (OSPO), helping Microsoft engineers use, contribute to and release open source at scale.