#mongodb logs for Wednesday the 11th of March, 2015

[00:16:47] <Boomtime> _blizzy_: hi there
[00:16:56] <_blizzy_> hi Boomtime.
[00:17:25] <Boomtime> not sure what you are trying to do with that gist, it isn't valid JSON certainly - maybe try people = [ ... ]
[00:18:05] <Boomtime> did you have a question about mongodb?
[00:18:16] <_blizzy_> yes, how would I explain MongoDB using json.
[00:21:13] <cheeser> what?
[00:41:23] <_blizzy_> cheeser, is it possible to explain what a mongodb database would look like using a dictionary or json?
[00:43:13] <Boomtime> _blizzy_: you can use mongoexport to get a JSON dump of a mongodb database.. is that what you are after?
[00:43:51] <_blizzy_> Boomtime, I understand that SQL databases are tables, but I'm having a hard time wrapping my head around what a mongodb database would look like.
[00:44:08] <GothAlice> _blizzy_: Many of the concepts are similar.
[00:44:57] <GothAlice> _blizzy_: See: http://docs.mongodb.org/manual/reference/sql-comparison/ and http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
[00:49:07] <GothAlice> _blizzy_: Using your JSON comparison, this is a better sample, using JavaScript notation. (Works in Python, too.)
[00:49:10] <GothAlice> https://gist.github.com/amcgregor/b7e4d3f399e81e561c39
[00:51:57] <GothAlice> This highlights a few things: "records" (documents) in a collection don't need to be exactly the same in their fields. There are no schemas, though it's generally a good idea to keep things pretty similar. (In this case, a "gender" of "r", robot, has an extra field, "reference".) Also, objects can nest "complex" datatypes, like lists (arrays), and even other dicts (called "embedded documents").
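A minimal shell sketch of the idea GothAlice describes (hypothetical documents, not the contents of her gist):
    db.people.insert({ name: "Alice", gender: "f", tags: ["staff", "dev"], address: { city: "Montreal", country: "CA" } })
    db.people.insert({ name: "Bender", gender: "r", reference: "Futurama", tags: ["robot"] })
    // both documents live in the same collection even though their fields differ;
    // "tags" is an array and "address" is an embedded document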
[00:54:29] <_blizzy_> GothAlice, that makes so much sense now. thank you.
[00:54:35] <_blizzy_> also, thank you, Boomtime.
[00:55:01] <GothAlice> For slightly more advanced reading, see: http://docs.mongodb.org/manual/data-modeling/, and, if you're coming from a relational/SQL world, see: http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html
[00:55:07] <GothAlice> :)
[00:55:18] <_blizzy_> thanks again, GothAlice.
[01:00:34] <Zyphonic> Anyone here familiar with pymongo and setting up custom datatypes for read/write? I went off of this article here - http://api.mongodb.org/python/current/examples/custom_type.html but the API docs for the Database type say it's being deprecated in 3.0
[01:01:03] <Zyphonic> http://api.mongodb.org/python/current/api/pymongo/database.html#pymongo.database.Database.add_son_manipulator is the link to the method that's deprecated
[04:01:12] <Jonno_FTW> anyone in?
[04:01:47] <Jonno_FTW> is it possible to query by date format? say I wanted all records that match this particular month
[04:01:53] <Jonno_FTW> regardless of year
[04:11:32] <Jonno_FTW> or do I have to use $where
[04:22:06] <cheeser> aggregations
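A rough sketch of the aggregation cheeser is pointing at (assumes a hypothetical "created" date field):
    db.records.aggregate([
        { $project: { month: { $month: "$created" }, doc: "$$ROOT" } },
        { $match: { month: 3 } }   // everything from March, regardless of year
    ])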
[04:33:15] <bufferloss> hi
[04:34:30] <joannac> hi
[04:38:14] <Streemo> I want my objects to have multiple Ids. Because I am keeping track of all individual instances. When I serve a page, I make a query to find an object based on some given instance ID. Is it faster to have the object contain an instanceID array field? or would it be faster to have an entirely different collection and do two queries?
[04:38:56] <joannac> how many instances per document?
[04:39:12] <Streemo> good question
[04:39:14] <Streemo> ummm
[04:39:33] <Streemo> well it depends on how many people share it
[04:39:43] <Streemo> for each share, i make note that that person shared it
[04:40:00] <Streemo> so it could be 0, it could be 100
[04:40:14] <Streemo> probably no more than 500
[04:40:19] <Streemo> 500 is an outlier probably
[04:40:28] <Streemo> i mean it would have to go viral in a local network on FB
[04:40:39] <Streemo> i share it, 5 of my friends share it,4 or 5 of theirs share it, etc..
[04:40:57] <Streemo> but theres a cap on how many instances can be spawned so it would at most be like 500
[04:41:11] <Streemo> does that answer it??
[04:43:59] <Streemo> i mean its really coming down to db.objects.findOne(instanceId in instanceArrayField) Versus: db.objects.findOne({_id: db.instances.findOne({_id: instanceId}).objectId})
[04:44:12] <Streemo> depending on if i normalize or denormalize my data
[04:44:19] <Streemo> respectively
[04:44:24] <Streemo> i mean
[04:44:29] <Streemo> oppositely?
[04:44:32] <Streemo> sorry my order was wrong
[04:44:33] <Streemo> o-o
[04:47:32] <joannac> well, 2 queries is always going to be slower than 1
[04:47:39] <joannac> but indexing a large array also sucks
[04:47:52] <joannac> you'll need to decide which one works better for your data
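The two options Streemo is weighing, written out as shell queries (collection and field names are hypothetical):
    // denormalized: one query against an indexed array field
    db.objects.createIndex({ instanceIds: 1 })
    db.objects.findOne({ instanceIds: instanceId })
    // normalized: two queries via a separate instances collection
    var inst = db.instances.findOne({ _id: instanceId })
    db.objects.findOne({ _id: inst.objectId })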
[04:58:10] <Streemo> iight
[04:59:34] <Streemo> joannac: i think if i were to plot the number VS. length of arrays, the distribution would be peaked around 5-10, which isn't too bad.
[04:59:48] <Streemo> the occasional outlier won't make a huge difference
[05:02:55] <Streemo> joannac: what do you think is better? Doing a single query on 10,000 objects with 10-15 single value fields (plus the one array field) having to shuffle through a len = 10 array per object OR: Doing one query on 100,000 very small objects with only two single value fields, and then another query on 10,000 objects only going through single value fields?
[05:03:44] <Streemo> 10,000 length 10 arrays queried once, VERSUS 100,000 single fields queried and then 10,000 single fields queries. pretty much
[05:05:25] <joannac> why would you need to shuffle through a length 10 array?
[05:05:39] <Streemo> that's the instance id thing we talked about
[05:05:47] <Streemo> each object has ~10 instances
[05:05:48] <joannac> you would just index it, surely
[05:05:52] <Streemo> some have 0-1 some have 500
[05:06:03] <Streemo> what do you mean
[05:06:27] <Streemo> what would i index
[05:06:48] <Streemo> with which data architecture? norm or denorm?
[05:11:51] <zerOnepal> all, why does one get Segmentation fault (core dumped) when doing a mongo login from the shell ??
[05:14:08] <joannac> zerOnepal: erm, need more info
[05:14:13] <joannac> Streemo: index the array field?
[05:14:24] <zerOnepal> my dmesg log looks like this
[05:14:24] <zerOnepal> [42705344.027400] mongo[21542]: segfault at 1 ip 00000000008bd4c8 sp 00007fffad740d70 error 4 in mongo[400000+af1000]
[05:15:00] <zerOnepal> Streemo what do you mean "index the array field ?"
[05:15:15] <joannac> zerOnepal: what version of the shell?
[05:15:54] <zerOnepal> joannac: even this gives me segfault
[05:15:57] <zerOnepal> mongo --version
[05:15:57] <zerOnepal> Segmentation fault (core dumped)
[05:17:17] <joannac> zerOnepal: o.O
[05:17:25] <joannac> maybe uninstall and reinstall
[05:17:53] <zerOnepal> its in production, joannac brother
[05:19:26] <joannac> zerOnepal: and? your mongo shell doesn't work.
[05:21:55] <zerOnepal> oh yeah joannac
[05:22:22] <zerOnepal> but I can still connect from remote
[05:23:09] <joannac> yes, because you're on a different machine with a different install of the mongo shell
[06:06:44] <zerOnepal> joannac: found the solution, it was due to LD_PRELOAD I exported for boosting my memory allocation for ruby process
[06:06:54] <zerOnepal> thanks, anyway
[06:18:53] <NoOutlet> Hello.
[06:19:08] <joannac> hi
[08:42:35] <amitprakash> Hi, while reading on http://blog.mongodb.org/post/87200945828/6-rules-of-thumb-for-mongodb-schema-design-part-1/2/3 .. they've mentioned two way referencing
[08:42:55] <amitprakash> however, how does two way referencing handle the problems that might arise due to race condition..
[08:44:42] <amitprakash> i.e. say user1 wants to assign a task X to person Y, while user2 wants to assign task X to person Z, if we update as set(task.user); set(user.task).. then the order of execution could follow the pattern set(taskT.user=X); set(taskT.user=Y); set(userY.task=T); set(userX.task=T)
[08:45:29] <amitprakash> is there a way to ensure transactions while updating two way referenced documents
[08:48:47] <kali> amitprakash: nope, there is no built-in multidoc strategy. but in your case, i don't think it is an issue: update the task collection first, while checking that the task is not allocated. then, and only if it worked, update the denormalization in the user collection
[08:50:10] <kali> amitprakash: basically, as long as you express the invariant "a task is assigned to 0 or 1 user" as something that is verifiable on one single document, it stays relatively easy
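A sketch of what kali suggests, with hypothetical collection and field names: check the single-document invariant in the update's query, and only touch the user document if that update actually changed something.
    var res = db.tasks.update(
        { _id: taskId, user: { $exists: false } },      // task must still be unassigned
        { $set: { user: userId } }
    )
    if (res.nModified === 1) {                          // the claim succeeded
        db.users.update({ _id: userId }, { $addToSet: { tasks: taskId } })
    }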
[08:51:23] <amitprakash> kali, the issue is with the query statements, the conditions to add might not be as simple as checking if taskT has a user assigned but significantly more complex
[08:52:29] <amitprakash> putting the condition as the query param might either be cpu intensive or not possible due to external dependencies
[08:52:47] <kali> cpu intensive: no, it's not.
[08:53:48] <amitprakash> kali, a condition could be something of the sort: there are no users with a count of this tasktype assigned > some number
[08:54:33] <amitprakash> i.e. only assign this "easy" task to a user if this task has no users and there are no users with more than 50 "easy" tasks
[08:55:26] <amitprakash> there are around 30m users, so doing that count aggregation takes time
[08:56:34] <kali> you need to denormalize that kind of thing anyway. you need to maintain an easy task counter on the user collection with an index on it.
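One way the denormalized counter kali mentions could look (hypothetical fields; the 50-task limit comes from amitprakash's example):
    db.users.createIndex({ easyTaskCount: 1 })
    db.users.update(
        { _id: userId, easyTaskCount: { $lt: 50 } },    // invariant checked on a single document
        { $inc: { easyTaskCount: 1 }, $addToSet: { tasks: taskId } }
    )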
[08:56:57] <kali> bottom line is: mongodb provides no ready-to-use tool to maintain multidoc invariant. you have to do it on your own
[08:57:14] <amitprakash> and I cant lock individual documents either
[08:57:17] <kali> (there is one exception, it's the "unique" index)
[08:57:37] <kali> amitprakash: you can't do it at the mongodb level, but you can implement a lock on top of it
[08:57:59] <kali> amitprakash: but with this type of constraint, you'll have to wonder if you have picked the right database for the job
[08:58:33] <kali> amitprakash: https://blog.codecentric.de/en/2012/10/mongodb-pessimistic-locking/
[08:59:33] <kali> amitprakash: everything from pessimistic locking to two phase commit can be implemented on top of mongodb, but it's up to the application
[09:12:42] <FlynnTheAvatar> Hi, I have issues installing mongodb-org 3.0.0-2 and 3.0.1-0 with yum - the error is mongodb-org conflicts with mongodb-org-server-3.0.0-2.el7.x86_64
[09:43:36] <Stiffler> hello
[09:43:45] <Stiffler> how do I store objects inside an object in mongo?
[09:45:57] <Derick> Stiffler: by nesting them? Not sure what you're trying to ask, but this works: db.collection.insert( { top_level_field: { nested_object_field1: 4, nested_object_field2: 'foo' } }
[09:46:00] <Derick> );
[09:46:34] <Stiffler> hmm yes something like this
[09:47:24] <Stiffler> can I do? db.collection.insert({top_level_field: { nested_object_field1: { xxx:1,yyy:2,etc:4 } }})?
[09:47:36] <Derick> sure
[09:47:56] <Stiffler> then will I be able to easily do queries on nested objects?
[09:48:15] <Stiffler> so what type in the schema should top_level_field be?
[09:48:18] <Derick> you will be able to do queries against nested fields, but the return will always be the full document
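A quick sketch of what Derick means, reusing the field names from his insert example:
    db.collection.find({ "top_level_field.nested_object_field1": 4 })
    // matches on the nested field, but each result is still the whole document
    // unless you add a projection, e.g. find({...}, { top_level_field: 1 })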
[09:48:25] <Derick> MongoDB doesn't have a schema
[09:48:33] <Derick> ... you mean with an ODM?
[09:48:38] <Stiffler> ye, with mongoose
[09:48:50] <Derick> I'm afraid I don't know mongoose.
[09:48:54] <Stiffler> ah ok :)
[09:49:17] <Stiffler> thanks for info
[09:50:12] <Derick> http://mongoosejs.com/docs/guide.html under "Defining your schema" -- the example shows it
[09:50:19] <Derick> in the "meta" field
[09:51:05] <Derick> and "comments" too, but as an array
[09:51:15] <Stiffler> ok thanks
[09:51:23] <Stiffler> this is what i was looking for
[10:03:44] <Petazz> Hi! How could I $group documents by their objectId timestamp?
[10:04:21] <Petazz> I'm trying something like $group: { _id: {day: id._getTimestamp()} } which does not work
[10:07:32] <Petazz> And using {day: $id.getTimestamp() } just says that $id is not defined
[10:15:47] <Stiffler> basically this mongoose is completely shit
[10:15:57] <Stiffler> it's useless
[10:16:07] <Stiffler> I can't find a simple insert in mongoose
[10:16:18] <Stiffler> but I know how to do it in pure mongo
[10:16:21] <Stiffler> the docs are shit
[10:16:25] <Stiffler> the concept is shit
[10:20:09] <Derick> heh
[10:20:13] <Derick> i wouldn't go that far
[10:20:23] <Derick> I'm sure it does what some people expect from it
[10:20:30] <Derick> I'm more of a "pure mongo" person myself though.
[10:21:43] <Derick> Petazz: I don't think you can easily group by the timestamp embedded in the _id field - unless you use aggregation I think
[10:21:44] <Stiffler> I would be, but I started using mongoose instead of pure mongo
[10:21:55] <Stiffler> and that was a mistake in this project
[10:22:00] <Stiffler> im not gonna do this again
[10:54:44] <d0x> when you connect hadoop with mongodb directly, will hadoop get the data directly from the shards or will it pass through mongos?
[10:55:34] <Derick> it will have to go through mongos
[10:55:38] <d0x> From the doc i got this: In parallel, Hadoop nodes pull data for their splits from MongoDB (or BSON) and process them locally
[10:55:42] <Derick> otherwise it might see duplicated data
[11:04:02] <d0x> That means if i have to process 5tb of data, i have to pass it through a single mongos node?
[11:04:36] <d0x> (or can every hadoop node have an own mongos?)
[11:06:33] <Derick> theoretically yes, but I don't know how the connector works
[12:33:06] <Stiffler> hi its me again
[12:33:15] <Stiffler> i would like to do query db.stops.find({'lineNr':'30','stopNr':10,'dirNr':12868,'timetable.type':"15"})
[12:33:21] <Stiffler> but timetable is array
[12:33:23] <Stiffler> and it doesn't work
[12:33:39] <Stiffler> it also shows timetable.type 16, 18 and more
[12:34:31] <cheeser> pastebin your output
[12:35:22] <Stiffler> not enough space in the terminal to get all results ;/
[12:35:38] <cheeser> just the first few docs is fine
[12:37:11] <Stiffler> http://pastebin.com/ZpeXZAE6
[12:37:46] <Stiffler> i would like to get only array field where type is 15
[12:38:10] <cheeser> i don't think you can do that. the query matches a document and returns all of it unless you define some projections.
[12:38:24] <cheeser> but i don't think you can only get the elements that match the type
[12:39:10] <cheeser> maybe if you add a projection of : { "timetable.$" : 1 }
[12:39:13] <cheeser> *maybe*
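cheeser's suggestion spelled out as a full query; the positional projection returns at most the first matching timetable element per document:
    db.stops.find(
        { lineNr: '30', stopNr: 10, dirNr: 12868, 'timetable.type': "15" },
        { "timetable.$": 1 }
    )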
[12:45:03] <Stiffler> doesn't work
[12:45:28] <Stiffler> and basically what are subdocuments for?
[12:45:32] <Stiffler> if I can't filter them
[12:46:52] <cheeser> um. you can query against them. you can return only the subdoc if you want. but that'd be at the timetable level because that's the 'doc' in question.
[12:50:14] <Stiffler> if I can, then how do I get the array field containing only the entries where type equals 15
[12:50:18] <Stiffler> it sounds easy
[12:50:36] <Stiffler> and should be doable easily, otherwise mongo would be useless
[12:51:42] <cheeser> it would hardly be useless just because it doesn't fit one corner case.
[12:52:01] <cheeser> you could use an aggregation. that would probably work.
[12:52:10] <cheeser> unwind the array. match the elements you want.
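A sketch of that pipeline for Stiffler's query (field names taken from his find() above):
    db.stops.aggregate([
        { $match: { lineNr: '30', stopNr: 10, dirNr: 12868 } },
        { $unwind: "$timetable" },
        { $match: { "timetable.type": "15" } }
    ])
    // each output document now carries exactly one timetable entry of type "15"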
[12:53:31] <StephenLynx> oy
[12:53:38] <cheeser> what up, yo?
[12:53:40] <StephenLynx> what you guys discussing?
[12:53:43] <StephenLynx> sub arrays?
[12:54:04] <cheeser> http://irclogger.com/.mongodb/2015-03-11
[12:54:22] <StephenLynx> :^)
[12:56:32] <Stiffler> ye subarrays
[12:56:56] <StephenLynx> they are not useless, but for sure they have their limitations
[12:58:07] <cheeser> i actually don't use arrays that much as they're a bit wonky to work with but i have to fix some morphia bugs around them so off i go. :D
[13:16:53] <pamp> Hi, what are the hardware requirements for mongos and config servers in a production cluster.. RAM and HD
[13:54:44] <wrighty> Hello
[13:55:09] <wrighty> I'm trying to install MongoDB 3 - but am getting errors - using YUM on Centos
[13:57:15] <StephenLynx> what error?
[13:57:29] <StephenLynx> I installed mongo 3 on a centOS 7 machine.
[13:58:11] <wrighty> Getting conflicts with mongo-10gen-server
[13:58:37] <wrighty> running on Centos 6.6
[13:59:12] <wrighty> I've tried what most stackoverflow methods say - which are running yum makecache
[13:59:40] <StephenLynx> I didn't use yum makecache
[13:59:50] <StephenLynx> I just set up the repo file and installed mongodb-org
[13:59:57] <wrighty> That's what I've tried - but it's failing. :
[14:00:20] <StephenLynx> http://docs.mongodb.org/manual/tutorial/install-mongodb-on-red-hat/ tried this?
[14:00:21] <wrighty> I set up the repo file in /etc/yum.repos.d - but then when running yum install mongodb-org I get the error.
[14:00:33] <wrighty> That's what I did.
[14:00:54] <StephenLynx> do you have any mongo package installed?
[14:01:22] <wrighty> Nope. Brand new install of CentOS
[14:01:27] <wrighty> I thought it might be a conflict - but it's not
[14:01:31] <wrighty> Wrongly reporting it for some reason.
[14:06:21] <FlynnTheAvatar> It is being worked on: https://jira.mongodb.org/browse/SERVER-17517
[14:07:19] <FlynnTheAvatar> as a workaround install 3.0.0-1:
[14:07:26] <FlynnTheAvatar> yum install -y mongodb-org-3.0.0-1.el7.x86_64 mongodb-org-mongos-3.0.0-1.el7.x86_64 mongodb-org-tools-3.0.0-1.el7.x86_64 mongodb-org-server-3.0.0-1.el7.x86_64 mongodb-org-shell-3.0.0-1.el7.x86_64 mongodb-org-tools-3.0.0-1.el7.x86_64
[14:07:59] <cheeser> look at the status...
[14:08:37] <FlynnTheAvatar> (replace el7 with el6 for centos 6.6)
[14:09:07] <wrighty> Aye, that's working. Slightly annoying that these issues have made it up to the release. :(
[14:09:13] <wrighty> Ah well - thanks for the solution! :D
[14:30:51] <pamp> Hi, what are the hardware requirements for config servers and mongos (routers) in a production cluster?
[14:32:03] <pamp> I'm planning a cluster with two shards with one Replica Set each
[14:33:06] <pamp> what's the best approach? use 7 machines: 2 shards, two routers, 3 config servers, or put on the same server one shard, one router and one config server
[14:33:08] <cheeser> config servers don't need much
[14:34:08] <pamp> don't need much, like what? in RAM, CPU and HD
[14:34:19] <cheeser> yeah. those are pretty light.
[14:36:38] <pamp> an Azure virtual machine instance A0, with 1 core, 0,75 GB RAM and 20 GB disk size is enough?
[14:36:48] <pamp> for example
[14:37:09] <cheeser> http://docs.mongodb.org/manual/administration/production-notes/#hardware-considerations
[14:56:30] <crised> Do I need mongodb replication when using mongodb AMI in EC2?
[14:56:41] <crised> e.g. MongoDB with 1000 IOPS
[14:56:45] <crised> it's a very simple app
[14:57:15] <GothAlice> crised: If you want high-availability (i.e. if one DB host goes down, your whole app doesn't go down) you'll need at least two DB hosts and an arbiter that could live on the application host.
[14:57:37] <GothAlice> Otherwise, have a really good backup plan. ;)
[14:57:56] <crised> GothAlice: data won't be lost, right?
[14:58:04] <crised> since it's backed by EBS
[14:58:57] <GothAlice> crised: There is the possibility for data loss, yes. If your application is in the middle of doing something with the data (i.e. saving a user's changes) and the DB host goes away, so does the in-progress change.
[14:59:30] <crised> GothAlice: ok.... What other option do I have?
[14:59:33] <crised> Mongolab?
[14:59:43] <GothAlice> A simple replica set. How much data do you have?
[15:00:12] <crised> GothAlice: very few, less than 100 MB
[15:00:32] <crised> GothAlice: A simple replica set, does this mean 3 ec2 large instances?
[15:00:39] <GothAlice> Heck no.
[15:00:50] <GothAlice> That'd be extreme overkill for your data needs.
[15:00:50] <crised> GothAlice: please expand that
[15:00:55] <crised> GothAlice: then what
[15:01:10] <GothAlice> You need as much RAM as you have data, + about 20%.
[15:01:20] <GothAlice> So, even a tiny instance could work, here.
[15:01:29] <crised> GothAlice: micro instances?
[15:01:34] <GothAlice> Yeah, that. ^_^
[15:01:39] <GothAlice> It's been a while since I AWS'd.
[15:01:51] <crised> GothAlice: so I need at least 3 micro instances, and set up a replica set
[15:01:59] <GothAlice> Well, two for the DB, one for your app.
[15:02:39] <crised> GothAlice: What if I expand later.... and want to have multiple light apps
[15:02:56] <crised> Shouldn't it be better to have 3 instances?
[15:04:00] <GothAlice> Well, for redundancy you only really need two copies of your data. For reliability (high availability) you'd need to add an arbiter to the cluster, but an arbiter doesn't really need its own host. All it does is vote, not store data.
[15:04:59] <crised> GothAlice: vote?
[15:06:10] <crised> GothAlice: isn't there a turn key solution for this?
[15:06:21] <GothAlice> Indeed. When you have a replica set with two nodes and one of them dies, the other needs some way to determine if _it_ is encountering the network failure, or if the other host is the one flaking out. An arbiter is a "neutral third party" that lets the remaining node know. (I.e. a node needs to see > 50% of the other nodes for it to become primary in the event of a failure.)
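A minimal sketch of the topology GothAlice describes (hostnames are hypothetical): two data-bearing members plus an arbiter that only votes.
    rs.initiate({
        _id: "rs0",
        members: [
            { _id: 0, host: "db1.example.com:27017" },
            { _id: 1, host: "db2.example.com:27017" },
            { _id: 2, host: "app1.example.com:27017", arbiterOnly: true }   // votes, stores no data
        ]
    })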
[15:06:28] <GothAlice> crised: https://mms.mongodb.com/
[15:06:31] <GothAlice> Free for up to 8 hosts. :)
[15:07:01] <GothAlice> (It can even spin up AWS nodes for you, but those are up to you to pay for. It's like a "bring your own host" service.)
[15:07:35] <cheeser> mms++
[15:07:51] <GothAlice> Even I use MMS on some projects. ^_^
[15:08:06] <cheeser> let's get the rest of those on board. ;)
[15:08:50] <crised> GothAlice: so this service is free under 1GB
[15:09:24] <crised> GothAlice: But I need to pay for the instances
[15:10:16] <cheeser> if you use AWS, yes.
[15:10:33] <cheeser> you can run it on local hardware, too.
[15:10:36] <crised> cheeser: hi there
[15:10:40] <cheeser> what up, yo?
[15:11:07] <crised> cheeser: you are cheeser from #java :)
[15:12:06] <cheeser> i'm everywhere!
[15:12:09] <crised> so mms will need at least 3 instances?
[15:12:19] <crised> cheeser: cool, I thought ##java was your cup of tea
[15:12:27] <cheeser> you can run a single node via mms
[15:12:33] <cheeser> crised: i'm a coffee guy
[15:12:43] <crised> cheeser: a single node, a single instance?
[15:12:44] <crised> :)
[15:14:21] <crised> cheeser: could you explain that? would there be HA if I only have one node?
[15:14:28] <cheeser> there wouldn't be
[15:14:40] <cheeser> but that's not what you asked. :)
[15:15:00] <crised> cheeser: then what would be the use of mms?
[15:15:23] <cheeser> mms is a management tool...
[15:15:26] <crised> in that case
[15:15:45] <cheeser> you'd still need to manage your database...
[15:15:58] <cheeser> there's monitoring, backup. upgrades. downgrades!
[15:16:16] <cheeser> all sorts of stuff to do irrespective of HA
[15:17:35] <crised> cheeser: what if I just throw a ec2 micro instances and do yum install mongodb, and that's it
[15:19:50] <cheeser> what about it?
[15:20:06] <crised> cheeser: so mms provides options for standalone instances and replica sets
[15:20:25] <crised> cheeser: so for a replica set in mms, how many instances does mms need?
[15:20:49] <cheeser> however many you want.
[15:20:58] <cheeser> you could deploy your entire replSet on one machine
[15:22:28] <crised> cheeser: what makes sense for light redundancy?
[15:23:01] <cheeser> one host per replSet member. otherwise it's a bit of a waste
[15:23:55] <crised> cheeser: one ec2 instance per replica set?
[15:24:06] <cheeser> yes
[15:24:13] <cheeser> MMS can spin them up for you.
[15:24:53] <crised> cheeser: since it's EBS backed then, there is no chance of data loss?
[15:25:57] <cheeser> be wary of absolute claims :)
[15:26:11] <crised> cheeser: yes, but there *should* be no data loss
[15:26:44] <cheeser> that's the idea at least, yes
[15:26:57] <GothAlice> crised: No. Any time you're dealing with moving parts, there's always risk. Having a replica set reduces the risk of catastrophic failure to scenarios where basically everyone in the same zone is having issues. We don't use AWS at work because it ate our data and required 36 hours of reverse engineering corrupted files to recover after a multi-zone cascade failure.
[15:27:15] <cheeser> ouch
[15:27:18] <GothAlice> :thumbsup: AWS
[15:27:29] <crised> GothAlice: What do you use?
[15:27:47] <GothAlice> SLA said "use multiple zones, you'll be safe". We learned that the SLA isn't always right. ;)
[15:27:54] <GothAlice> crised: Rackspace, these days.
[15:28:16] <crised> cheeser: so the cheapest I can go is use m3.medium at $0.07 / hour
[15:28:28] <kali> it's been a while since last multi zone us-east1 event, though
[15:28:29] <GothAlice> crised: We're also strange people, over here. We don't have permanent storage on our database hosts. ¬_¬
[15:28:49] <crised> GothAlice: *sighs* maybe it's better just to use dynamodb
[15:29:03] <kali> or rds :)
[15:29:05] <crised> at least simpler
[15:29:19] <GothAlice> crised: It's just that you keep making absolutes, such as "no chance of data loss". ;)
[15:29:34] <GothAlice> There's _never_ "no chance of data loss", regardless of database engine.
[15:29:53] <crised> GothAlice: yes, right, no absolutes in life, I agree
[15:30:05] <kali> no absolute in risk management
[15:30:19] <kali> in life... YMMV
[15:32:15] <crised> GothAlice: since mms can run in ec2, the cheapest I can go is m2.large, so it's at least $50 / mo
[15:32:51] <cheeser> i just spun up something on openvz.io. $6ish/month
[15:33:10] <kali> a rpi ?
[15:33:10] <cheeser> but if it's for a serious business, you'll want to spend decent money
[15:33:14] <crised> cheeser: mmmm, worried about that uptime
[15:33:32] <crised> cheeser: maybe it's easier to just use compose.io or mongolab...
[15:33:36] <crised> and maybe even cheaper
[15:34:40] <crised> cheeser: maybe even something like heroku
[15:35:02] <cheeser> and you want uptime you say? :D
[15:35:31] <crised> cheeser: heroku is bad uptime?
[15:35:34] <crised> or compose.io>?
[15:36:55] <cheeser> you wouldn't need mms with compose
[15:37:24] <crised> cheeser: yes exactly
[15:37:41] <crised> cheeser: it's only $18 mo
[15:42:41] <bagpuss_thecat> afternoon all
[15:46:12] <NoOutlet> GothAlice, were you able to track down that kernel panic?
[15:50:21] <bagpuss_thecat> how long does it take the MMS Automation agent to discover host details and mongod instance details?
[15:51:13] <cheeser> for bootstrapping?
[15:54:02] <bagpuss_thecat> cheeser: I believe so... all I have are two hosts, and I can't seem to deploy the monitoring agent to them
[15:54:30] <bagpuss_thecat> I have two unpublished changes, but I always get "Another session or user has already published changes" when I try to confirm them
[15:54:39] <cheeser> strange
[16:00:06] <bagpuss_thecat> hmmmm
[16:00:25] <bagpuss_thecat> "Error sending status to MMS: Error POSTing to https://mms..."
[16:02:36] <bagpuss_thecat> Discarded my changes, and now I'm back at /setup/onboarding, with a verified automation agent
[16:02:56] <bagpuss_thecat> however, the continue button is unavailable, even though it says "Successfully verified"
[16:03:06] <bagpuss_thecat> bring back the old MMS :-/
[16:05:38] <crised_> mongolab or compose.io?
[16:09:53] <crised_> Dedicated mongod (your own server process) -> Does this mean I can create as many dbs as I want?
[16:31:15] <cheeser> /1
[16:38:30] <Derek57> Hi all, had a question. Has anyone been having an issue with WiredTiger and memory? On the previous storage engine, I was able to run a server with 2 compounds and 2.5 million documents on a 48GB server. Now with WiredTiger, I'm maxing out 112GB of RAM and 16GB of swap, with intermittent crashes from segmentation faults, and half the time it's impossible to run .count().
[16:39:09] <Derek57> And not sure if that last message sent. Had a bit of an issue re-setting up my IRC client.
[16:39:38] <cheeser> https://jira.mongodb.org/browse/SERVER-16623?jql=text%20~%20%22wiredtiger%20memory%22
[16:40:07] <cheeser> slightly cleaner: https://jira.mongodb.org/browse/SERVER-17542?jql=status%20in%20%28Open%2C%20%22In%20Progress%22%29%20AND%20text%20~%20%22wiredtiger%20memory%22
[16:44:12] <Derek57> Ah... Well that all looks very familiar to my recent issues. A bit relieved that it wasn't something I had misconfigured.
[16:53:27] <GothAlice> NoOutlet: Alas, no, couldn't reproduce. I'm actually suspecting a bug in VirtualBox's SATA implementation. :/
[16:59:40] <Derek57> Oh. I'm running it currently in Azure, possibly that could be related?
[16:59:57] <Derek57> As it's not on a dedicated box.
[17:10:36] <GothAlice> Derek57: Alas, my issue is unrelated to yours. Mine is about high-stress benchmarking (hosing a node for ten minutes of multi-megabyte-per-second mixed find() and update() operations) killing the VM. ;) I am using WiredTiger, though.
[17:14:08] <Derek57> Fair enough! I may switch back to mmapv1 for the time being, and will do more research in to why I may be having the issue. Just wanted a quick look to see if anyone else was having something related. :)
[18:36:48] <Petazz> Hi! I'm trying to calculate how many users there were in a db grouped by the date saved in a user's ObjectId. I'm trying to do this with mapReduce but not really getting a sane result. Why? http://pastebin.com/GiTsh4mq
[18:37:18] <Petazz> I know there is just a return 1 now, but even then the result is not 1 for each day
[18:40:00] <cheeser> use aggregation
[18:40:18] <kali> yeah.
[18:41:20] <kali> apart from that, you're making the most common mistake with map/reduce. reduce will be called 0 to N times for one given key. so the map output must match the "form" of your expected value
[18:41:55] <kali> and reduce can only reduce: its output has the same "form" as the map output, and the items in the values array have that very same form too
[18:43:05] <kali> so you need to emit(id,1) in map, and reduce must return the sum of the integers it will find in values
[18:43:17] <kali> but use aggregation pipeline anyway
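A sketch of the map/reduce shape kali describes, counting users per day from the _id timestamp (collection name is hypothetical):
    db.users.mapReduce(
        function () {
            var d = this._id.getTimestamp();
            d.setHours(0, 0, 0, 0);          // truncate to the day
            emit(d, 1);                      // map output has the same "form" as the reduced value
        },
        function (key, values) {
            return Array.sum(values);        // reduce only sums partial counts, so re-reducing stays correct
        },
        { out: { inline: 1 } }
    )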
[18:44:29] <Petazz> Hmm ok, this was basically an excercise to try figure out how mapReduce works. So map is called once per every document and then reduce is called N times for a key?
[18:45:28] <GothAlice> Petazz: https://gist.github.com/amcgregor/1623352 was the first map/reduce I ever wrote that worked. ;)
[18:45:29] <kali> Petazz: 0 if there is only one mapped document for one given key, and then by batches of 1000 iirc
[18:46:26] <Petazz> Ah ok so it is called once per key, but the array is limited to 1000?
[18:47:04] <kali> nope. if there are 1010 keys, you'll get called once for the first 1000 keys, then once for the resulting value plus the remaining 10 keys
[18:47:38] <kali> or any other combination. it's whatever mongodb decides
[18:49:41] <Petazz> Ok so this is basically for sum operations
[18:49:52] <Petazz> Since the array can be handled "recursively" if you will
[18:50:07] <GothAlice> Petazz: Well, map/reduce has a number of applications.
[18:50:18] <kali> yeah, anything associative
[18:50:25] <kali> and commutative, i guess
[18:51:43] <GothAlice> Wow, that last search of mine returned some odd results for "strange uses for map reduce". Like, "10 strange uses for blood", and even less safe-for-work entries. Thanks, Google.
[18:54:17] <Petazz> Let's say I wanted to have a document that holds the number of users per day so that I don't have to calculate the historic data again, what would be the best way to do it?
[18:54:32] <kali> aggregation
[18:54:38] <GothAlice> Even better: pre-aggregation.
[18:54:43] <kali> or maintain it on the fly
[18:55:09] <GothAlice> Petazz: http://www.devsmash.com/blog/mongodb-ad-hoc-analytics-aggregation-framework covers a number of methods of storing historical analytics data and covers their impact in performance and disk space.
[18:55:52] <GothAlice> (We do click and other event tracking at work. This article was invaluable when it came time to model our own data.)
[18:56:02] <Petazz> Cool!
[18:56:24] <Petazz> Yea calculating them on the fly with a cron for example is definitely the best way
[18:56:33] <GothAlice> Well, no.
[18:56:33] <Petazz> Or is it?
[18:56:49] <GothAlice> If you do bulk processing under a cron job you're placing intermittent high load on your infrastructure.
[18:57:14] <GothAlice> You're also delaying your metrics by, potentially, the full time period between cron runs. Our analytics (http://cl.ly/image/142o1W3U2y0x) are live. :)
[18:58:14] <Petazz> Nice. Haven't read much about aggregation with mongo yet :/
[18:58:33] <GothAlice> (Using pre-aggregation as described in the article I linked. When saving the Hit document, I "upsert" an Analytic document for the relevant combination of hourly time period and other criteria like invoice, company, job, etc.) We only keep Hits around for a short time using a TTL index that automatically cleans up old (by time) data.
[19:00:39] <GothAlice> Hits are accurate to the millisecond, but we only keep a week of them around. Analytics are only hour-accurate. https://gist.github.com/amcgregor/1ca13e5a74b2ac318017 is an example Analytic document and aggregate query to give the two-week comparison line chart from that dashboard screenshot I linked.
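A minimal sketch of that pre-aggregation pattern (field names are hypothetical, not GothAlice's actual schema):
    // on every hit, upsert the hour bucket for the relevant combination of criteria
    db.analytics.update(
        { period: ISODate("2015-03-11T18:00:00Z"), job: jobId },
        { $inc: { hits: 1 } },
        { upsert: true }
    )
    // the raw, millisecond-accurate hits expire on their own after a week via a TTL index
    db.hits.createIndex({ created: 1 }, { expireAfterSeconds: 60 * 60 * 24 * 7 })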
[19:13:35] <rpcesar> hello. I have used mongo extensively for many applications in the past, I am used to it and I love it, however I am entering a situation where I am going to be severely memory (RAM) constrained on a project. My working set of data will never fit and "moar ram" is not an option. I guess what I want is to keep indexes in RAM, let the data itself primarily fetch off disk. I am worried about the performance profile of mongo pul
[19:14:50] <rpcesar> is there any way to force/configure mongo to keep the indexes, or at least the primary index, in ram at all times, or failing that is there an alternative database that better fits this performance profile?
[19:17:13] <GothAlice> rpcesar: The default mmapv1 backend relies on the system's kernel to best handle paging blocks from disk into RAM. I don't even know if there is a least-recently-used cache, or what mechanism it would use to determine which pages can be swapped out. AFAIK there is no way to keep something in RAM within MongoDB except by regularly priming it (i.e. making a query that requires running the entire index.)
[19:18:06] <GothAlice> Also, as a note, in that situation any query that can't be fully covered by an index (i.e. needs to compare values within the documents) will be ludicrously slow.
[19:19:21] <rpcesar> this will be primarily used as a K/V store, so that's not really a problem (I would basically use a big slave occasionally to do aggregates, but everything else would be ObjectID / Natural Index only)
[19:19:58] <rpcesar> but yea, do you know any database that better fits the performance profile I am talking about, preferably one that plays somewhat nice with mongo?
[19:20:46] <GothAlice> rpcesar: What's your write load going to be like?
[19:21:57] <GothAlice> Also, there's an optimization that may save some space, if you have lots of records: use single-letter field names. Since MongoDB needs to store the field names within each document, the space allocated to these names can add up quickly if you have many small documents.
[19:22:00] <rpcesar> expecting about equal reads and writes.
[19:23:07] <GothAlice> https://en.wikipedia.org/wiki/Berkeley_DB might be a candidate (it's a K/V store), but I'm currently unaware of how it operates in constrained environments.
[19:23:51] <rpcesar> ive looked at LevelDB and Tokyo Cabinet, using redis as an external index. In all these cases though, they seem to suck up all the ram they can.
[19:24:13] <GothAlice> Indeed, most database engines will do that.
[19:24:14] <rpcesar> what I am trying to do here is have a custom index I cannot express easily in mongo or other situations residing in redis.
[19:24:29] <GothAlice> :|
[19:24:29] <rpcesar> but I need something that plays nice with it ram wise
[19:24:53] <rpcesar> I want good data locality here, but I need nothing more than a KV store on the persistence side
[19:25:04] <GothAlice> In the two projects I've been involved with in recent years that had Redis, I've been able to replace both with plain MongoDB.
[19:25:23] <rpcesar> dealing with graph structures for bio-engineering.
[19:25:40] <GothAlice> And you have a constrained environment to run in? ;^P
[19:26:01] <rpcesar> yup :)
[19:26:04] <GothAlice> (Same with projects involving ZeroMQ/RabbitMQ. And memcache/membase.) For graphs, though, you should invest in a real graph database.
[19:26:13] <GothAlice> Rather than attempting to mash several non-graphs together.
[19:26:19] <rpcesar> ive looked into a few in that area.
[19:26:46] <rpcesar> Neo4J being one of them, their idea of what constitutes a graph and mine were somewhat different.
[19:27:15] <jclif> Hi all. We're trying to set up a sharded cluster from an existing replica set, and were wondering about a few details. We have around 1.5 TB of data in our current setup, and plan to connect this replica set to a cluster of 5 additional replica sets with 2 400GB instances. The idea will be to retire our initial replica set after the data has been migrated. Is this possible?
[19:28:05] <Boomtime> 1.5TB of data, across how many collections? how big is the biggest collection?
[19:28:34] <jclif> the biggest collection is around 800 gb
[19:28:39] <jclif> then 300
[19:28:43] <jclif> then some smaller ones
[19:28:58] <GothAlice> jclif: Yes. Initial chunk migration might take some time with that much data, though. A big consideration: sharding key. With a bad key, records won't evenly distribute amongst the nodes.
[19:29:23] <GothAlice> (And having MongoDB try to dump 800GB into a 400GB shard is clearly a no-go. ;)
[19:30:21] <Boomtime> 800GB collection will be very difficult to shard
[19:30:21] <Boomtime> http://docs.mongodb.org/manual/reference/limits/#Sharding-Existing-Collection-Data-Size
[19:30:59] <jclif> yeah, that's been a big issue for us; we went into the mongo office in nyc for help on choosing a sharding key, but the consensus seemed to be that we needed to just use a shard key with the id
[19:31:08] <jclif> is it not possible to shard with a collection of that size?
[19:31:38] <Boomtime> please follow the link
[19:31:53] <Boomtime> you may be able to do it by increasing the chunk size first
[19:32:20] <GothAlice> jclif: That'd give uniform distribution amongst nodes, without any concern for data locality (improving query performance). Looks like the over-large collections would need to be dumped, cleared, sharded, then restored, or values tuned as Boomtime suggests.
[19:33:47] <Petazz> I guess there's no way to run custom javascript within the aggregation framework?
[19:34:07] <Petazz> Just found the jira ticket for my specific problem: https://jira.mongodb.org/browse/SERVER-9406
[19:34:26] <GothAlice> Petazz: Yeah, welcome to the club, on that ticket.
[19:34:58] <GothAlice> Petazz: The point of the aggregate framework is to get away from needing to spin up the JS VM.
[19:35:13] <GothAlice> (Aggregate queries are faster than the equivalent map/reduce in every case I've tested.)
[19:35:40] <Petazz> Yea, found that reason too. I guess the operators are native from C++ so they should run faster
[19:42:20] <daaku> we're hitting a panic in the mgo driver where the server is returning more documents than numberToReturn in an OP_GET_MORE. the docs at http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol/ are not entirely clear, but it seems like the server is allowed to do this?
[19:42:58] <daaku> (we're getting this in less than 1% of our queries, so it's quite rare)
[19:43:55] <jclif> i misspoke; our largest db is 800gb; the largest collection is 350gb. with our key being so small, and the chunk size correctly set, given that we can properly shard this collection, how would one handle the growth of such a collection?
[19:44:04] <jclif> @Boomtime
[19:45:45] <Boomtime> jclif: once a collection is sharded, growth in that regard is not a problem
[19:46:00] <GothAlice> It's dividing up the initial data that's the problem.
[19:46:11] <Boomtime> what aspect of a growing collection concerns you? size of the shards for provisioning? something else?
[19:49:34] <fewknow> GothAlice: whats up
[19:49:44] <fewknow> jclif: you can shard using mongo-connector
[19:49:47] <GothAlice> fewknow: Looks like you got nickserv working. \o/
[19:49:50] <fewknow> any size data with no downtime
[19:49:57] <fewknow> https://github.com/10gen-labs/mongo-connector
[19:50:27] <fewknow> jclif: growth is handled by sharding...that is the point...you can add more shards
[19:50:53] <fewknow> GothAlice: yeah finally
[19:51:01] <jclif> i had not heard that there was an upper limit on the initial sharding of collections, so was just concerned
[19:51:07] <jclif> thanks
[19:51:17] <jclif> will look into mongo-connector
[19:51:32] <GothAlice> fewknow: Okay, mongo-connector is awesome. Not sure why I haven't cannibalized it earlier. (Dex is likewise awesome, and a 10gen tool I regularly use.)
[19:51:53] <fewknow> GothAlice: I have been using it for a while...even made contributions to the code.
[19:51:59] <fewknow> It is very powerful
[20:29:08] <delinquentme> with regards to mongo ( lulz, duh ) ... how do I figure out whether my particular use case ... can be well contained within a single server instance?
[20:29:32] <GothAlice> delinquentme: A good general rule is: can your data fit in the RAM of that single instance?
[21:16:10] <greyTEO> does $addToSet work on an array of objects?
[21:16:28] <greyTEO> from my test, it does not validate the object
[21:16:47] <fewknow> validate?
[21:16:54] <fewknow> mongo doesn't do validation of data
[21:17:02] <greyTEO> comparison*
[21:17:04] <fewknow> it does work on array of sub documents
[21:17:38] <fewknow> ohh...to compare the object...gotcha
[21:17:48] <fewknow> i am not sure if it does .... never had use case
[21:17:59] <fewknow> why are you putting object in array?
[21:18:34] <greyTEO> I want to be able to query by object values. eg offices.name
[21:18:37] <fewknow> do you need an array or can you just use subdocuments
[21:19:01] <fewknow> that will be really slow in an array
[21:19:09] <fewknow> it will have to search the entire array
[21:20:17] <greyTEO> even if the array is small?
[21:20:25] <greyTEO> im open to suggestions..
[21:20:47] <greyTEO> I had it as a nested object with the objectid as the key and the value as the object
[21:21:10] <greyTEO> but this only allows me to search by offices.{objectId}.name
[21:24:07] <fewknow> yeah but that will be much much faster
[21:24:16] <fewknow> for a small array sure....why even put it in an array?
[21:24:20] <fewknow> have a separate collection
[21:24:24] <fewknow> each object is a document
[21:27:54] <greyTEO> fewknow, I am mainly doing it to denormalize my data
[21:28:35] <greyTEO> i have 3 collections. 1 contains all 3 as a complete object.
[21:29:22] <greyTEO> They are inserted/updated by Apache Spark. I wanted to nest the documents to avoid having references and multiple lookups..
[23:03:55] <grazfather> Hey guys, I am looking for an operator like $addtoSet, but that only checks for a certain key. e.g. I have a list of simple dictionaries, and I want to make sure the _name_ is unique, not necessarily the whole dictionary
[23:05:39] <daidoji> grazfather: can you give an example?
[23:10:02] <ejb> Hello, I'm looking for some design advice. I'd like to build a simple product comparison / review engine. Products will have a variable number of attributes so mongo came to mind. Are there any frameworks out there that might help me with this concept?
[23:11:12] <ejb> As an example, consider bicycles. I would essentially have a giant table with all of the attributes that might matter to someone shopping for a bicycles: size, speeds, weight, price, etc.
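How ejb's variable-attribute products might look as documents (purely illustrative values):
    db.products.insert({ type: "bicycle", name: "Roadster 500", size: "L", speeds: 21, weight_kg: 9.8, price: 1200 })
    db.products.insert({ type: "bicycle", name: "City Cruiser", size: "M", speeds: 7, price: 450, basket: true })
    // documents in one collection can carry different attribute sets, as discussed earlier in the log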
[23:11:39] <grazfather> daidoji: Sure. I have a collection 'clients' which has a field 'logins'. Logins is a list of items whose schema is {'url':str, 'username':str, 'password':str}. I want to update a certain URLs username and password. 'push' will add a duplicate, and addtoSet will verify that url, u/n, and password all match
[23:12:08] <ejb> I'd like to be able to add products (and their attributes) through a simple UI. If there's already something out there that has the UI, even better.
[23:21:49] <daidoji> grazfather: and what end result are you looking for?
[23:22:20] <daidoji> ejb: hmm thats pretty vague. From a data modeling standpoint you might be able to do all that with Mongo
[23:22:52] <daidoji> ejb: UIs, frameworks, and picking all those things are a bit outside the purview of this channel
[23:23:16] <daidoji> ejb: if you're asking those questions my advice would be to pick any of them and start building what you have in mind and you'll learn as you go along
[23:24:24] <daidoji> ejb: so Rails or Django etc... for creating a web UI and frontend, model your data in Mongo etc.., and basically start building stuff and see what breaks
[23:24:37] <ejb> daidoji: yeah, I was hoping that keeping it vague would yield some creative ideas. I'm mostly looking to cut out the CRUD work and get right to the actual idea
[23:25:15] <ejb> daidoji: cool, I'm versed in django. Came into the mongo world sideways, via meteor
[23:25:51] <daidoji> ejb: roger, then I'd stick with Django (mongoengine which works mostly okay) and the Admin console to build stuff fast
[23:26:12] <daidoji> ejb: although in my experience if you want to go beyond the basics, admin console starts becoming quite a burden
[23:26:21] <ejb> daidoji: do you still get the admin ui when using mongo with django?
[23:26:32] <ejb> daidoji: yeah... grappelli even?
[23:26:39] <daidoji> ejb: if you keep a disciplined schema
[23:26:50] <daidoji> ejb: I've never used grappelli so I wouldn't know
[23:27:08] <daidoji> ejb: but I followed the tutorial using mongoengine
[23:27:16] <daidoji> and everything seemed to work pretty well
[23:28:18] <daidoji> ejb: GothAlice might have more info for you as she's much more well versed in all that than I probably am, so she might have more ideas for you
[23:30:17] <GothAlice> ¬_¬
[23:31:18] <GothAlice> Possibly with the fiery passion of a thousand burning hot Latino stellar bodies. (English sucks. "Stars" is ambiguous. "Suns" maybe?)
[23:31:34] <grazfather> daidoji: I want to be able to update a client entry such that it'll add a new url to logins, but if a login with the same url (i don't care about u/n and password matching) exists, instead update that entry in the array
[23:32:09] <GothAlice> grazfather: I have a "LoginAttempt" collection for the purpose of capturing both failures and successes, for auditing. It simplifies what you are trying to do, a lot.
[23:32:55] <grazfather> I don't think that's applicable at all?
[23:33:38] <daidoji> GothAlice: yeah I'm not a fan either but ejb is familiar with it
[23:33:39] <GothAlice> Update-if-not-different and upserts.
[23:34:09] <GothAlice> "add a new, but if exists, update it" is the exact definition of an upsert. ;)
[23:34:55] <grazfather> Yeah but it is a field in an entry, not an entry in its own collection
[23:36:01] <daidoji> GothAlice: do you have any experience with the best way to transfer data between Mongo instances (short of writing scripts)?
[23:36:23] <daidoji> GothAlice: use case is I'm using one instance as a Data Warehouse type endpoint and then need to occasionally replicate that over to Production
[23:36:41] <daidoji> I've been mongoexporting/importing --upsert but I was wondering if there was a better way
[23:36:49] <GothAlice> grazfather: http://docs.mongodb.org/manual/reference/operator/query/elemMatch/ with http://docs.mongodb.org/manual/reference/operator/update/positional/ will let you "update if it's there". You can then check for nModified/nUpdated, if zero, insert. But this introduces a race condition that won't exist if you pivot your data and turn those embedded documents into their own collection and use real upserts.
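A sketch of the $elemMatch + positional update GothAlice outlines, using grazfather's login fields (variable names are hypothetical):
    var res = db.clients.update(
        { _id: clientId, logins: { $elemMatch: { url: theUrl } } },
        { $set: { "logins.$.username": u, "logins.$.password": p } }
    )
    if (res.nMatched === 0) {   // no login with that url yet, so push a new one
        db.clients.update({ _id: clientId }, { $push: { logins: { url: theUrl, username: u, password: p } } })
    }
    // note the race condition GothAlice mentions between the failed update and the $push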
[23:36:54] <daidoji> (like to capture deletes from the Data Warehouse instance etc...)
[23:37:19] <GothAlice> daidoji: I was introduced to https://github.com/10gen-labs/mongo-connector today.
[23:37:38] <daidoji> GothAlice: word, I'll check it out thanks
[23:37:40] <GothAlice> daidoji: It sounds like mongo-connector is pretty much what you're looking for.
[23:38:28] <daidoji> GothAlice: sweet, thanks
[23:38:39] <grazfather> GothAlice: cool thanks a lot
[23:38:44] <ejb> GothAlice: lol, I'd go with "Suns".
[23:38:46] <daidoji> do you work for 10gen?
[23:38:58] <GothAlice> daidoji: I do not. I'm just a guest moderator at the moment. ;)
[23:39:02] <ejb> GothAlice: Any ideas for my quest?
[23:39:18] <daidoji> haha thats what I thought, I only ask because you're a deep well of knowledge when it comes to Mongo
[23:39:28] <GothAlice> ejb: There's a term for the type of ranking and search system you're looking for.
[23:39:31] <GothAlice> Oh!
[23:39:31] <daidoji> GothAlice: they should really be paying you to evangelize
[23:39:45] <GothAlice> ejb: If you're using Python for this, there's a presentation I need to dig up for you.
[23:40:01] <GothAlice> daidoji: They do keep pestering me to hand them a CV. XP
[23:40:05] <ejb> GothAlice: I might be! I'd appreciate it
[23:42:37] <GothAlice> ejb: https://youtu.be/M9V1e-rG7VA?t=11m26s "Easy AI with Python", specifically, the "neural networks for data mining" part.
[23:43:00] <GothAlice> The presentation that got me started on adding AI bits to my Exocortex project. ;)
[23:43:38] <GothAlice> I just wish I could remember the term for the type of engine you're looking for, though. :(
[23:44:41] <ejb> GothAlice: ping me when you remember. I'm really looking for something simple though. I would be doing all the data entry
[23:45:06] <GothAlice> ejb: May I PM you?
[23:45:17] <ejb> yeah