PMXBOT Log file Viewer


#dcpython logs for Friday the 2nd of December, 2011

[16:07:42] <joshfinnie> Anyone see this? http://opendatahackdc.eventbrite.com/
[16:08:28] <joshfinnie> Here is the list of ideas they are going to be throwing around: http://www.opendataday.org/wiki/App_Ideas
[16:45:35] <aclark> j00bar: redis q?
[16:45:53] <j00bar> pssht. what makes you think i know anything about redis?
[16:45:57] <j00bar> oh wait.
[16:46:00] <j00bar> um... yeah. go ahead.
[16:46:00] <aclark> joshfinnie: rings a bell yeah
[16:48:04] <aclark> j00bar: heh ok i want to start pushing form results into redis, fields include firstname, lastname, email, and X number of trove classifiers selected. i'm still a bit confused by the key syntax and what i do with it later. i know i can create a key alex.clark:aclark@aclark.net and that's as far as i've gotten ;-)
[16:48:26] <aclark> x number of trove classifiers sounds like a list
[16:48:44] <aclark> each key needs to be associated with that list somehow
[16:49:02] <j00bar> aclark: you've got a couple of options and it really depends on how you want to/need to use your data.
[16:49:43] <j00bar> key structure is entirely up to you. one thing i'd encourage you to do is to use a hierarchical key structure and an easily identifiable delimiter
[16:49:47] <aclark> j00bar: for now i just want to collect it, eventually i'll be checking the data when people try to sign in to a beta program.
[16:49:55] <aclark> ok
[16:49:59] <j00bar> we end up using double-colons -- e.g. adgroup::stats:::12345
[16:50:03] <aclark> oic
[16:50:05] <aclark> yeah
[16:50:07] <j00bar> (ignore the triple colon)
[16:50:13] <aclark> got it
[16:50:18] <j00bar> there's a redis method to do glob-based searching of keys...
[16:50:21] <aclark> then how do you use each element e.g. 12345 ?
[16:50:25] <aclark> ah right
[16:50:33] <j00bar> so in management operations, we can ask keys adgroup::stats::*
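A minimal sketch of the glob-style key lookup j00bar describes, simulated with a plain dict and `fnmatch` (in real Redis this is the `KEYS` command, e.g. `r.keys("adgroup::stats::*")` with redis-py; the store contents here are hypothetical):

```python
import fnmatch

# Stand-in for the Redis keyspace; keys use the double-colon delimiter convention.
store = {
    "adgroup::stats::12345": "impressions=10",
    "adgroup::stats::67890": "impressions=42",
    "user::alex.clark::email": "aclark@aclark.net",
}

def keys(pattern):
    """Glob-match key names, like the Redis KEYS command."""
    return [k for k in store if fnmatch.fnmatch(k, pattern)]

print(sorted(keys("adgroup::stats::*")))
# matches only the two adgroup::stats keys, not the user key
```

Note that in production Redis, `KEYS` scans the whole keyspace and is usually reserved for management operations like this, with `SCAN` preferred for anything hot-path.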
[16:50:49] <j00bar> so anyway
[16:50:52] <j00bar> you have two options really.
[16:50:53] <aclark> ah
[16:50:58] <j00bar> you can push structured, serialized data into redis
[16:51:08] <j00bar> e.g. json.dumps(dict_obj)
[16:51:17] <aclark> neat
[16:51:47] <j00bar> or if your object is predictably structured, you can push multiple keys for the same "object"
[16:52:13] <j00bar> e.g. user::alex.clark::firstname, user::alex.clark::email
[16:52:22] <j00bar> the difference depends entirely on your application
[16:52:33] <j00bar> one is a single redis query that you process in python
[16:52:45] <j00bar> one is multiple redis queries that you can select elements from in redis
[16:53:06] <j00bar> so if you ever *just* want emails, it may make more sense to use multiple keys
[16:53:19] <j00bar> if you only ever want complete user records, serialized structures may make more sense
[16:53:54] <j00bar> which is faster? eh.
[16:53:59] <j00bar> again, depends on the data you want.
[16:54:12] <j00bar> pipelining 6 queries for one record is pretty damn fast.
[16:54:29] <j00bar> json serialization/deserialization - also pretty damn fast.
[16:54:38] <j00bar> but each method provides different flexibility
[16:54:55] <aclark> well in python i'd have users['alex.clark']['email'] = 'aclark@aclark.net' and users['alex.clark']['troves'] = ['Programming :: Python', 'Animals :: Cats'] and so on
[16:54:56] <j00bar> serializing gives you more flexibility with structure and you can add other keys later easily
[16:55:17] <j00bar> breaking down into multiple keys gives you faster access to individual items without having to query everything
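The two options j00bar contrasts can be sketched side by side. A plain dict stands in for the Redis string keyspace (the equivalent redis-py calls would be `r.set(key, value)` / `r.get(key)`; the user data is illustrative):

```python
import json

store = {}  # stand-in for a Redis instance

user = {"firstname": "Alex", "lastname": "Clark", "email": "aclark@aclark.net"}

# Option 1: one key holding the whole serialized record
store["user::alex.clark"] = json.dumps(user)

# Option 2: one key per field of the same "object"
for field, value in user.items():
    store[f"user::alex.clark::{field}"] = value

# Option 1 readback: one query, then deserialize and pick fields in Python
record = json.loads(store["user::alex.clark"])

# Option 2 readback: fetch *just* the email, never touching the rest
email = store["user::alex.clark::email"]
```

As the discussion notes, option 1 makes it easy to add structure later, while option 2 lets Redis serve individual fields (and with real Redis, several per-field `GET`s can be pipelined into one round trip).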
[16:55:31] <aclark> right
[16:55:36] <j00bar> if you go the latter route, for troves, i'd use a set
[16:55:49] <j00bar> use sets when you don't want duplicates and when you don't care about order
[16:56:02] <aclark> right
[16:56:08] <j00bar> use lists when you do care about insertion order
[16:56:28] <j00bar> sets also give you great comparison operators
[16:56:52] <aclark> so my keys are user::name::email and then i associate that with a list or set somehow?
[16:57:00] <aclark> (that contains the troves)
[16:57:15] <j00bar> you'd have user::$USERNAME::troves
[16:57:25] <j00bar> and that would be a set or a list
[16:57:26] <aclark> i guess via zadd(key, troves)
[16:57:27] <aclark> ah
[16:57:32] <j00bar> zadd is for sorted (scored) sets
[16:57:34] <j00bar> you want sadd
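The set-versus-list distinction maps directly onto Python's own types, which is a convenient way to sketch it (the Redis commands would be `r.sadd(key, *values)` for sets and `r.rpush(key, value)` for lists; the trove values are just examples):

```python
# SADD semantics: no duplicates, no guaranteed order.
troves = set()
troves.update(["Programming :: Python", "Animals :: Cats"])  # like r.sadd(key, *vals)
troves.add("Programming :: Python")  # duplicate add: no effect

# RPUSH semantics: duplicates allowed, insertion order preserved.
history = []
history.append("Programming :: Python")  # like r.rpush(key, val)
history.append("Programming :: Python")  # duplicate is kept

print(len(troves), len(history))
```

So per-user troves selected from a form fit a set: a user can't meaningfully pick the same classifier twice, and order doesn't matter.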
[16:57:52] <j00bar> these troves are per user?
[16:57:56] <aclark> yes
[16:58:00] <aclark> form selection values
[16:58:13] <j00bar> like... "contractor" or "django" or "probably-human"?
[16:58:31] <aclark> based on a subset of http://pypi.python.org/pypi?%3Aaction=list_classifiers
[16:58:40] <j00bar> but those are software classifiers
[16:58:50] <j00bar> how can a person be production/stable?
[16:58:58] <aclark> subset, not all of them
[16:59:02] <j00bar> oh
[16:59:03] <j00bar> well
[16:59:03] <MattBowen> j00bar: i am production/stable
[16:59:05] <aclark> only the ones that make sense
[16:59:12] <j00bar> MattBowen: you'll be mature some day, i hope
[16:59:18] <MattBowen> j00bar: unlikely
[16:59:25] <j00bar> aclark: the fun part about sets is that you can then use this for search
[16:59:32] <aclark> j00bar: you have the right idea, e.g. i like "django", "cats", "beer" select all that apply
[17:00:07] <aclark> j00bar: thanks!
[17:00:09] <j00bar> if somebody wants to find all people who have "gtk" and "education" you can use the set comparison operations to see if a given trove set applies
[17:00:41] <j00bar> so if that's an operation you care about, breaking out your structures would probably make more sense
[17:00:48] <j00bar> versus obtaining *all* records and parsing them in python
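The search idea above can be sketched with Python set comparisons; the per-user sets mimic what would live under `user::$USERNAME::troves`, and the usernames/troves are hypothetical. (With real Redis you could keep an inverted index like `trove::gtk::users` and let `SINTER` do the intersection server-side instead.)

```python
# Per-user trove sets, as they would be stored under user::$USERNAME::troves.
troves = {
    "alex.clark": {"django", "cats", "beer"},
    "j00bar": {"gtk", "education", "django"},
    "mattbowen": {"gtk", "education"},
}

def users_with(*wanted):
    """Users whose trove set contains every wanted trove (subset test)."""
    wanted = set(wanted)
    return sorted(u for u, t in troves.items() if wanted <= t)

print(users_with("gtk", "education"))
```

This is exactly the trade-off in the log: broken-out structures let the store answer "who has gtk *and* education?" directly, instead of pulling every serialized record back into Python and filtering there.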
[17:44:49] <aclark> right
[18:25:36] <aclark> j00bar: so a key can be used in any of the data types to look up data right? i.e. the key/value store bit aka nosql