After the research for my last post, I’ve gotten a bit more curious about micro-loan services like Kiva. I have been a Kiva member for over 10 years now, and I love reading the stories of the folks I’m helping. Kiva is also more transparent than the non-profits I’ve donated to: I can set how much of my loan goes to the person in need and how much Kiva can use to run its operations.
Kiva also has a public API, all GET calls, which opens their data up to additional surfaces & analysis.
So today’s exercise is to get the data from their loans API into a local store so that I can run my own analysis. And I’d like to do this using MongoDB & Node.js, just to continue my learning in those domains as well.
Step 1: Getting the data
The Kiva API documentation is pretty straightforward. I want to get the most recent loans for this exercise. The default number of records returned per call is 20, and at the time of this writing I would need to call 295 times to get all records. However, the max per_page value is 500, which brings it down to a manageable 13 calls.
I decided against a script for this, for now, and performed the 13 calls manually, incrementing the page parameter each time, grabbing the output and placing it in a file. Make sure the calls end in “.json”, otherwise you’ll get XML crap. Quote the URL so the shell doesn’t treat the & as its background operator. No pre-auth needed:
curl "http://api.kivaws.org/v1/loans/newest.json?per_page=500&page=1"
Remove the paging JSON object at the start of each response as you build out the file:
{"paging":{"page":1,"total":5893,"page_size":20,"pages":295}
And lastly, make sure the file is one valid JSON array of loan objects. If you are following along & use the same dataset, grab it here.
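The manual steps above could also be scripted. Here is a minimal sketch, assuming Node 18+ (for the built-in `fetch`) and assuming each response has the shape `{ "paging": {...}, "loans": [...] }`, as the paging fragment above suggests. The function names (`pageUrl`, `mergeLoanPages`, `downloadNewestLoans`) are my own and not part of the Kiva API.

```javascript
const fs = require('fs');

// Endpoint taken from the curl call above.
const BASE_URL = 'http://api.kivaws.org/v1/loans/newest.json';

// Build the URL for one page of results.
function pageUrl(page, perPage) {
  return `${BASE_URL}?per_page=${perPage}&page=${page}`;
}

// Drop the "paging" object from each response and concatenate the loans.
function mergeLoanPages(responses) {
  return responses.flatMap((response) => response.loans || []);
}

// Fetch every page in order, then write one JSON array for mongoimport.
// Assumes the first response reports the page count in paging.pages.
async function downloadNewestLoans(outFile, perPage = 500) {
  const first = await (await fetch(pageUrl(1, perPage))).json();
  const responses = [first];
  for (let page = 2; page <= first.paging.pages; page++) {
    responses.push(await (await fetch(pageUrl(page, perPage))).json());
  }
  fs.writeFileSync(outFile, JSON.stringify(mergeLoanPages(responses)));
  return responses.length; // number of calls made
}
```

Calling `downloadNewestLoans('loans.json')` would fetch the pages one at a time and write the merged array. Nothing runs until that call is made, so the two helper functions can be tried on their own first.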
Step 2: Loading the data
Mongo makes this super simple. Step 1 got our data in just the right format, so all I need to do is call mongoimport. The --jsonArray flag ensures the entries are imported as separate loan records and not as one giant document.
mongoimport --db kivaDB -c loans --file loans.json --jsonArray
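For context, without `--jsonArray` mongoimport reads a stream of JSON documents, typically one per line, rather than a single array. A small sketch of converting from the array shape to the one-per-line shape, in case that form is ever needed; the helper name `toNdjson` is hypothetical.

```javascript
// Convert a JSON array (the --jsonArray shape) into newline-delimited JSON,
// the shape mongoimport reads when --jsonArray is omitted.
function toNdjson(jsonArrayText) {
  const docs = JSON.parse(jsonArrayText);
  if (!Array.isArray(docs)) {
    throw new TypeError('Expected a top-level JSON array of loan records');
  }
  return docs.map((doc) => JSON.stringify(doc)).join('\n');
}

// Prints one loan document per line.
console.log(toNdjson('[{"id":1,"name":"A"},{"id":2,"name":"B"}]'));
```

The `Array.isArray` check also doubles as a quick sanity test that the file from Step 1 really is a bare array and not still wrapped in an outer object.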
Step 3: Determining the schema
To start querying the data in Node.js it’s best if we use mongoose; otherwise I would be writing a lot of code just to do the MongoDB calls. What makes mongoose so powerful is that it provides schema definition and enforcement that can be easily integrated with the rest of my Node.js app.
Ritesh Kumar provided a nice shortcut to getting the mongoose schema based on JSON data. Using transform.now.sh, and one record from the JSON test data file, I can get the mongoose schema. Right now, I am just interested in getting the nesting & data types correct. Later on, as I build out an app, I may provide additional enforcement right in the schema.
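To give an idea of what that additional enforcement could look like: Mongoose schema types accept options such as `required`, `min`, `trim`, and `uppercase`. The sketch below covers only a handful of fields, shown at the top level for brevity, and the specific rules are my own assumptions for illustration, not constraints documented by Kiva. It is written as a plain object so it can be inspected without a database.

```javascript
// Stricter definitions for a handful of loan fields.
// These rules are illustrative assumptions, not Kiva's documented constraints.
const strictLoanFields = {
  id: { type: Number, required: true },
  name: { type: String, required: true, trim: true },
  loan_amount: { type: Number, required: true, min: 0 },
  funded_amount: { type: Number, min: 0 },
  location: {
    country_code: { type: String, uppercase: true, minlength: 2, maxlength: 2 }
  },
  posted_date: { type: Date }
};

// With mongoose installed, this would become a schema via:
//   const strictLoan = new mongoose.Schema(strictLoanFields);
console.log(Object.keys(strictLoanFields));
```

Validators like these run when documents are saved through mongoose, so data already loaded with mongoimport is not checked retroactively.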
Step 4: Validating the data
With the data loaded, we can start querying it. See comments inline to follow along.
var mongoose = require('mongoose'),
    Schema = mongoose.Schema;

//connect to local datastore
mongoose.connect('mongodb://localhost/kivaDB');
mongoose.connection.on('open', function () {
  console.log('Mongoose connected');
});

//set schema, based on https://transform.now.sh/json-to-mongoose
var loan = new Schema({
  loans: {
    id: {
      type: 'Number'
    },
    name: {
      type: 'String'
    },
    description: {
      languages: {
        type: [
          'String'
        ]
      }
    },
    status: {
      type: 'String'
    },
    funded_amount: {
      type: 'Number'
    },
    basket_amount: {
      type: 'Number'
    },
    image: {
      id: {
        type: 'Number'
      },
      template_id: {
        type: 'Number'
      }
    },
    activity: {
      type: 'String'
    },
    sector: {
      type: 'String'
    },
    themes: {
      type: [
        'String'
      ]
    },
    use: {
      type: 'String'
    },
    location: {
      country_code: {
        type: 'String'
      },
      country: {
        type: 'String'
      },
      town: {
        type: 'String'
      },
      geo: {
        level: {
          type: 'String'
        },
        pairs: {
          type: 'String'
        },
        type: {
          type: 'String'
        }
      }
    },
    partner_id: {
      type: 'Number'
    },
    posted_date: {
      type: 'Date'
    },
    planned_expiration_date: {
      type: 'Date'
    },
    loan_amount: {
      type: 'Number'
    },
    borrower_count: {
      type: 'Number'
    },
    lender_count: {
      type: 'Number'
    },
    bonus_credit_eligibility: {
      type: 'Boolean'
    },
    tags: {
      type: 'Array'
    }
  }
});

//set the model
var loanModel = mongoose.model('Loan', loan);

//perform search, similar to select * from loans limit 5;
loanModel.find({}, {}, {limit: 5, sort: {timestamp: -1}},
  function (err, ls) {
    if (err) { throw err; }
    //print out to make sure we got 5 back
    console.log(ls.length);
    //iterate through the 5 and print specific items from the data-set for spot-checking
    for (var i = 0; i < ls.length; i++) {
      var loanitem = ls[i].toObject();
      console.log(loanitem.id);
      console.log(loanitem.loan_amount);
      console.log(loanitem.name);
    }
  });
That’s it! I can now export the loanModel to use throughout the rest of my Node.js app.
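One caveat for newer setups: as far as I know, Mongoose 7 and later no longer accept callbacks on queries like `find`, so the spot check would need promises or `async`/`await` there. A minimal sketch follows, written to take the model as an argument so it can live in its own module. `spotCheck` is a name I made up, and the stub model only stands in for `loanModel` so the snippet runs without a database.

```javascript
// Return a few fields from the first `limit` loans for spot-checking.
// `model` is expected to be a Mongoose model such as loanModel.
async function spotCheck(model, limit) {
  const loans = await model.find({}).limit(limit).lean();
  return loans.map((loan) => ({
    id: loan.id,
    loan_amount: loan.loan_amount,
    name: loan.name
  }));
}

module.exports = { spotCheck };

// Stand-in for a real model, mimicking only find().limit().lean().
const fakeModel = {
  find: () => ({
    limit: (n) => ({
      lean: async () =>
        [{ id: 1, loan_amount: 500, name: 'Ana', use: 'x' }].slice(0, n)
    })
  })
};

spotCheck(fakeModel, 5).then((rows) => console.log(rows));
```

In the real app, `spotCheck(loanModel, 5)` would replace the stub call. Using `lean()` returns plain objects, which plays the same role as the `toObject()` call in the callback version.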