PMXBOT Log file Viewer


#dcpython logs for Friday the 2nd of December, 2011

[16:07:42] <joshfinnie> Anyone see this? http://opendatahackdc.eventbrite.com/
[16:08:28] <joshfinnie> Here is the list of ideas they are going to be throwing around: http://www.opendataday.org/wiki/App_Ideas
[16:45:35] <aclark> j00bar: redis q?
[16:45:53] <j00bar> pssht. what makes you think i know anything about redis?
[16:45:57] <j00bar> oh wait.
[16:46:00] <j00bar> um... yeah. go ahead.
[16:46:00] <aclark> joshfinnie: rings a bell yeah
[16:48:04] <aclark> j00bar: heh ok i want to start pushing form results into redis, fields include firstname, lastname, email, and X number of trove classifiers selected. i'm still a bit confused by the key syntax and what i do with it later. i know i can create a key alex.clark:aclark@aclark.net and that's as far as i've gotten ;-)
[16:48:26] <aclark> x number of trove classifiers sounds like a list
[16:48:44] <aclark> each key needs to be associated with that list somehow
[16:49:02] <j00bar> aclark: you've got a couple of options and it really depends on how you want to/need to use your data.
[16:49:43] <j00bar> key structure is entirely up to you. one thing i'd encourage you to do is to use a hierarchical key structure and an easily identifiable delimiter
[16:49:47] <aclark> j00bar: for now i just want to collect it, eventually i'll be checking the data when people try to sign in to a beta program.
[16:49:55] <aclark> ok
[16:49:59] <j00bar> we end up using double-colons -- e.g. adgroup::stats:::12345
[16:50:03] <aclark> oic
[16:50:05] <aclark> yeah
[16:50:07] <j00bar> (ignore the triple colon)
[16:50:13] <aclark> got it
[16:50:18] <j00bar> there's a redis method to do glob-based searching of keys...
[16:50:21] <aclark> then how do you use each element e.g. 12345 ?
[16:50:25] <aclark> ah right
[16:50:33] <j00bar> so in management operations, we can ask keys adgroup::stats::*
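A minimal sketch of the glob-style key lookup j00bar describes, simulated with a plain dict and `fnmatch` (in real Redis this is the `KEYS` command, e.g. `r.keys("adgroup::stats::*")` with redis-py; the store contents here are hypothetical):

```python
import fnmatch

# Stand-in for the Redis keyspace; keys use the double-colon delimiter convention.
store = {
    "adgroup::stats::12345": "impressions=10",
    "adgroup::stats::67890": "impressions=42",
    "user::alex.clark::email": "aclark@aclark.net",
}

def keys(pattern):
    """Glob-match key names, like the Redis KEYS command."""
    return [k for k in store if fnmatch.fnmatch(k, pattern)]

print(sorted(keys("adgroup::stats::*")))
# matches only the two adgroup::stats keys, not the user key
```

Note that in production Redis, `KEYS` scans the whole keyspace and is usually reserved for management operations like this, with `SCAN` preferred for anything hot-path.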
[16:50:49] <j00bar> so anyway
[16:50:52] <j00bar> you have two options really.
[16:50:53] <aclark> ah
[16:50:58] <j00bar> you can push structured, serialized data into redis
[16:51:08] <j00bar> e.g. json.dumps(dict_obj)
[16:51:17] <aclark> neat
[16:51:47] <j00bar> or if your object is predictably structured, you can push multiple keys for the same "object"
[16:52:13] <j00bar> e.g. user::alex.clark::firstname, user::alex.clark::email
[16:52:22] <j00bar> the difference depends entirely on your application
[16:52:33] <j00bar> one is a single redis query that you process in python
[16:52:45] <j00bar> one is multiple redis queries that you can select elements from in redis
[16:53:06] <j00bar> so if you ever *just* want emails, it may make more sense to use multiple keys
[16:53:19] <j00bar> if you only ever want complete user records, serialized structures may make more sense
[16:53:54] <j00bar> which is faster? eh.
[16:53:59] <j00bar> again, depends on the data you want.
[16:54:12] <j00bar> pipelining 6 queries for one record is pretty damn fast.
[16:54:29] <j00bar> json serialization/deserialization - also pretty damn fast.
[16:54:38] <j00bar> but each method provides different flexibility
[16:54:55] <aclark> well in python i'd have users['alex.clark']['email'] = 'aclark@aclark.net' and users['alex.clark']['troves'] = ['Programming :: Python', 'Animals :: Cats'] and so on
[16:54:56] <j00bar> serializing gives you more flexibility with structure and you can add other keys later easily
[16:55:17] <j00bar> breaking down into multiple keys gives you faster access to individual items without having to query everything
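The two options j00bar contrasts can be sketched side by side. A plain dict stands in for the Redis string keyspace (the equivalent redis-py calls would be `r.set(key, value)` / `r.get(key)`; the user data is illustrative):

```python
import json

store = {}  # stand-in for a Redis instance

user = {"firstname": "Alex", "lastname": "Clark", "email": "aclark@aclark.net"}

# Option 1: one key holding the whole serialized record
store["user::alex.clark"] = json.dumps(user)

# Option 2: one key per field of the same "object"
for field, value in user.items():
    store[f"user::alex.clark::{field}"] = value

# Option 1 readback: one query, then deserialize and pick fields in Python
record = json.loads(store["user::alex.clark"])

# Option 2 readback: fetch *just* the email, never touching the rest
email = store["user::alex.clark::email"]
```

As the discussion notes, option 1 makes it easy to add structure later, while option 2 lets Redis serve individual fields (and with real Redis, several per-field `GET`s can be pipelined into one round trip).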
[16:55:31] <aclark> right
[16:55:36] <j00bar> if you go the latter route, for troves, i'd use a set
[16:55:49] <j00bar> use sets when you don't want duplicates and when you don't care about order
[16:56:02] <aclark> right
[16:56:08] <j00bar> use lists when you do care about insertion order
[16:56:28] <j00bar> sets also give you great comparison operators
[16:56:52] <aclark> so my keys are user::name::email and then i associate that with a list or set somehow?
[16:57:00] <aclark> (that contains the troves)
[16:57:15] <j00bar> you'd have user::$USERNAME::troves
[16:57:25] <j00bar> and that would be a set or a list
[16:57:26] <aclark> i guess via zadd(key, troves)
[16:57:27] <aclark> ah
[16:57:32] <j00bar> zadd is for sorted (scored) sets
[16:57:34] <j00bar> you want sadd
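The set-versus-list distinction maps directly onto Python's own types, which is a convenient way to sketch it (the Redis commands would be `r.sadd(key, *values)` for sets and `r.rpush(key, value)` for lists; the trove values are just examples):

```python
# SADD semantics: no duplicates, no guaranteed order.
troves = set()
troves.update(["Programming :: Python", "Animals :: Cats"])  # like r.sadd(key, *vals)
troves.add("Programming :: Python")  # duplicate add: no effect

# RPUSH semantics: duplicates allowed, insertion order preserved.
history = []
history.append("Programming :: Python")  # like r.rpush(key, val)
history.append("Programming :: Python")  # duplicate is kept

print(len(troves), len(history))
```

So per-user troves selected from a form fit a set: a user can't meaningfully pick the same classifier twice, and order doesn't matter.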
[16:57:52] <j00bar> these troves are per user?
[16:57:56] <aclark> yes
[16:58:00] <aclark> form selection values
[16:58:13] <j00bar> like... "contractor" or "django" or "probably-human"?
[16:58:31] <aclark> based on a subset of http://pypi.python.org/pypi?%3Aaction=list_classifiers
[16:58:40] <j00bar> but those are software classifiers
[16:58:50] <j00bar> how can a person be production/stable?
[16:58:58] <aclark> subset, not all of them
[16:59:02] <j00bar> oh
[16:59:03] <j00bar> well
[16:59:03] <MattBowen> j00bar: i am production/stable
[16:59:05] <aclark> only the ones that make sense
[16:59:12] <j00bar> MattBowen: you'll be mature some day, i hope
[16:59:18] <MattBowen> j00bar: unlikely
[16:59:25] <j00bar> aclark: the fun part about sets is that you can then use this for search
[16:59:32] <aclark> j00bar: you have the right idea, e.g. i like "django", "cats", "beer" select all that apply
[17:00:07] <aclark> j00bar: thanks!
[17:00:09] <j00bar> if somebody wants to find all people who have "gtk" and "education" you can use the set comparison operations to see if a given trove set applies
[17:00:41] <j00bar> so if that's an operation you care about, breaking out your structures would probably make more sense
[17:00:48] <j00bar> versus obtaining *all* records and parsing them in python
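The search idea above can be sketched with Python set comparisons; the per-user sets mimic what would live under `user::$USERNAME::troves`, and the usernames/troves are hypothetical. (With real Redis you could keep an inverted index like `trove::gtk::users` and let `SINTER` do the intersection server-side instead.)

```python
# Per-user trove sets, as they would be stored under user::$USERNAME::troves.
troves = {
    "alex.clark": {"django", "cats", "beer"},
    "j00bar": {"gtk", "education", "django"},
    "mattbowen": {"gtk", "education"},
}

def users_with(*wanted):
    """Users whose trove set contains every wanted trove (subset test)."""
    wanted = set(wanted)
    return sorted(u for u, t in troves.items() if wanted <= t)

print(users_with("gtk", "education"))
```

This is exactly the trade-off in the log: broken-out structures let the store answer "who has gtk *and* education?" directly, instead of pulling every serialized record back into Python and filtering there.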
[17:44:49] <aclark> right
[18:25:36] <aclark> j00bar: so a key can be used in any of the data types to look up data right? i.e. the key/value store bit aka nosql