[01:41:40] <Mark_> I'm guessing that things have changed over the years
[01:49:38] <digitalextremist> Mark_: the driver seems to be radically out of sync with the Wiki. I suppose I’ll post an issue if one hasn’t been made already
[01:50:07] <digitalextremist> … except there is no issue tracker. Sweet!
[01:50:18] <Mark_> since I was just playing around it wasn't a big concern for me; I was just converting some NoSQL-friendly stuff to Mongo
[01:51:17] <digitalextremist> I have mongodb running in production and the latest gem was a breaking change in many areas :-/
[05:40:20] <NaN> (fedora 19) I created the .repo file and when I try to install it, it says "No package mongodb-org available." Is there any extra step to install it with yum?
[05:41:48] <ranman> NaN: what steps did you follow?
[05:42:08] <ranman> NaN: did you previously have the mongodb-10gen packages installed?
[05:46:40] <NaN> ranman: the .repo file is on the same dir as the other repos
[05:47:05] <weeb1e> Though top did not say all applications were using 64GB, so I can't say exactly how much ram mongo was using
[05:47:14] <ranman> weeb1e: do you have 64GB of RAM on that machine, or did it just say 64GB of virtual RAM? also, it should have been paged out by your operating system if the other databases needed the memory
[05:48:19] <weeb1e> All applications accessing other databases were experiencing insane query times
[05:48:29] <ranman> I no longer understand what you're saying. Hop in a PM if you'd like more help, but what you're saying doesn't make a lot of sense to me, sorry
[05:48:29] <weeb1e> Even with nothing touching the large collections database
[05:49:22] <weeb1e> I now know I cannot store that much data in mongodb in a useful way, and have to rethink things
[05:49:43] <weeb1e> Yeah it was just a mongo issue
[05:49:48] <ranman> so the next time something like that happens try to spin up mongoperf or run db.currentOp() to ascertain what exactly is taking up all that time
[05:50:01] <weeb1e> But I have a bunch of services that use mongo and rely on a fast query time, so after a day it became very noticeable
[05:50:03] <ranman> there's also a --slowms option that will log queries that take longer than a certain threshold
[05:50:29] <ranman> weeb1e: understood sorry that happened, if you still have the logs you can definitely get a better understanding of what was happening
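A minimal sketch of the two suggestions above, in the mongo shell (the 100 ms threshold is just an example value):

```
// Show what's running right now; long operations stand out via secs_running.
db.currentOp()

// Log any operation slower than 100 ms to the system.profile collection.
db.setProfilingLevel(1, 100)

// Later, inspect the slowest recorded operations.
db.system.profile.find().sort({ millis: -1 }).limit(5).pretty()
```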
[05:50:48] <ranman> NaN: is there an issue with the mongodb.repo file do you think?
[05:50:52] <weeb1e> Ok, once I rework this project's storage, I will pay better attention to performance once it is running again
[05:51:03] <ranman> NaN: I was just able to do this on a test box
[05:51:06] <NaN> ranman: yum clean all did the trick, it was a yum problem
[05:51:24] <weeb1e> I always knew I needed to change that collection, but by the time I got to it, it was already too large to migrate the data
[05:51:28] <ranman> NaN: just your friendly neighborhood nightowl
[05:52:08] <weeb1e> I feel a column store would be much better suited for stats storage, but when I tested putting some of the data into postgres, it used even more disk space than mongo's millions of documents
[05:52:16] <ranman> weeb1e: yikes :/ that's rough. there really are a lot of options you have to prevent stuff like that
[05:52:38] <ranman> weeb1e: event data (non-entity data) like stats is particularly well suited to document stores like mongodb IMO
[05:52:48] <NaN> weeb1e: are you planning to use mongo again, for the same project?
[05:52:53] <ranman> and I like postgres just fine too :D
[05:53:06] <weeb1e> NaN: Yes, since I don't know of something else that can store the data more efficiently
[05:53:54] <weeb1e> I have a few integers that need to be recorded for ~20k servers every ~10 seconds
[05:54:12] <ranman> weeb1e: There's a really great presentation that mongodb has called Real Time Analytics With Bucketing by scott hernandez
[05:54:24] <ranman> *bucketing and pre-aggregation
[05:54:58] <ranman> but this presentation search on mongodb's website sucks.
[05:55:03] <weeb1e> I'll take a look at that, thanks
[05:55:37] <NaN> mongo with node I suppose, to make it real time, isn't it?
[05:55:43] <weeb1e> My application queries ~20k servers and records integers representing server number, client count, ping and timestamp
[05:56:33] <ranman> weeb1e: is it the write load that it can't handle?
[05:56:38] <weeb1e> The web backend application then renders graphs for the data on the web front end, and also receives realtime updates from the query backend
[05:57:02] <weeb1e> ranman: It's just the amount of accumulated data it's storing
[05:57:47] <weeb1e> If it ever had to become one, I would do that
[05:58:19] <weeb1e> I feel that such small amounts of stats data is not really a good fit for mongo documents though
[05:58:38] <weeb1e> Because the keys and bson data will use up a lot more space than the actual data
[05:58:44] <ranman> weeb1e: if you go to mongodb world there's a talk about something similar on the scale of 4GB I think it's called weathermen or something
[05:58:58] <ranman> weeb1e: are you using shortnames for the keys?
[05:59:58] <weeb1e> Again, when all I need to store is 4x integers, the keys will still use more space than the data, and the _id will use more than everything put together?
[06:02:18] <ranman> that's why I think you should look into bucketing, larger documents by minute / hour / etc.
[06:02:37] <weeb1e> Yeah, that is what I'm planning on doing anyway
[06:02:45] <weeb1e> But I still want as much detail as possible
[06:02:55] <weeb1e> Since I allow zooming into graphed periods of data
[06:02:58] <ranman> weeb1e: you can do that without losing any granularity at all
[06:03:48] <weeb1e> Yeah that's true, though it still will end up using a lot of disk space, and at some point mongo performance will become an issue again
[06:04:02] <weeb1e> So I need to decide how much granularity will be sustainable
[06:05:17] <ranman> weeb1e: I think this might help? not sure: http://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports/
[06:06:50] <weeb1e> That sounds like it's on the right track, I will read through it when I have time
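A rough sketch of the bucketing idea from the pre-aggregated-reports pattern linked above: one document per server per hour, one sub-field per 10-second sample (the collection name, bucket key, and field names here are made up):

```
// Upsert the 348th 10-second slot of the hour into server 42's hourly bucket.
db.stats.update(
    { _id: "42:2014-06-10T05" },
    { $set: { "samples.347": { clients: 112, ping: 23 } } },
    { upsert: true }
)
```

One document then holds 360 samples, so the _id and per-document overhead is paid once per server per hour instead of once per sample, without dropping any granularity.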
[07:24:30] <Diplomat> is it possible to have like.. 3 read servers and 2 write servers.. like i write into 2 servers and 3 servers will have consistent data
[07:25:24] <Diplomat> so 3 servers will be optimized for reading (SSD etc) and 2 servers are optimized for receiving data and then sending that data into read servers for access
[08:14:43] <kali> lkthomas: double quotes around "log_lines"
[08:15:26] <kali> Diplomat: you can have one "write" server (the primary node in a replica set) and use the others to perform reads
[08:15:57] <kali> Diplomat: to have two "write" servers, you need to shard, and in that case every node gets only half of the data
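A sketch of what kali is describing, in the mongo shell (the database, collection, and shard-key names are placeholders):

```
// Reads can be routed to replica-set secondaries via a read preference.
db.getMongo().setReadPref("secondaryPreferred")

// Scaling writes across servers means sharding, which splits the data rather than copying it.
sh.enableSharding("mydb")
sh.shardCollection("mydb.events", { serverId: 1, ts: 1 })
```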
[08:16:25] <Zelest> kali, as for my pull and push within the same query, that's not possible.. but nothing stops me from adding the new url first and removing all the old ones afterwards :D
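Something like this, assuming a "urls" array field (siteId, newUrl, and oldUrls are placeholders); since $push and $pull can't modify the same field in one update, it takes two calls:

```
// First add the new url, then strip out the old ones.
db.sites.update({ _id: siteId }, { $push: { urls: newUrl } })
db.sites.update({ _id: siteId }, { $pull: { urls: { $in: oldUrls } } })
```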
[08:16:30] <lkthomas> it works, thanks. After the clone, the disk usage didn't change much
[08:16:39] <lkthomas> kali: should I look at storageSize or size itself ?
[08:28:48] <kali> well, you already know the ns, you should not mess with the "v", and the name is usually generated ok. so you only need to call createIndex with the key
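So a call as small as this should be enough (collection and field names are just examples); ns, v, and name get generated automatically:

```
// Only the key document is required.
db.mycollection.createIndex({ timestamp: 1 })
```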
[08:39:15] <Diplomat> @Kali: well it's okay if it gets half of the data.. as long as it writes fast and stores the info so I can use a processor to process that data and store it in the read-only server
[08:39:26] <Diplomat> I was thinking about using Cassandra for writing and MongoDB for reading
[08:40:44] <kali> have you got serious reason to think one or the other is not able to do both ?
[08:40:56] <kali> one database is complicated enough...
[08:41:36] <Diplomat> well I need to be able to write raw data into the write server even when the read server for the queue is down
[08:42:35] <Diplomat> I'd need to have it like MongoDb (read only) -> Queue -> Raw processor -> Cassandra -> Processor -> MongoDb
[08:43:41] <Diplomat> So when MongoDb is down I still have the queue, which serves as a little buffer between mongo and the raw processor
[08:44:57] <Diplomat> because mongodb storage is the most important part of the system, and when it goes down I can use cassandra's data until mongo is up again
[09:12:17] <dragoonis> If I have a mongodb collection in production and I want to run a background job to clear out that collection and update it with new results, what's a suggested technique to ensure there's no loss of service when swapping the live collection with the newly generated collection?
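One common approach, sketched with placeholder collection names: have the background job build into a staging collection, then rename it over the live one with dropTarget set, so the swap happens in a single step:

```
// Background job writes the fresh results into "results_new", then:
db.results_new.renameCollection("results", true)   // true = drop the old "results"
```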
[10:26:02] <dragoonis> Derick, does the php extension for MongoDB support db.collection.copyTo()? I'm not seeing this in: http://www.php.net/manual/en/class.mongocollection.php
[10:52:42] <G1eb> anyone else doing the mongo university here?
[10:54:22] <Diplomat> guys, is there a way to update like 1000 rows at once ?
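A single update touches every matching document when multi is set (collection and field names are placeholders):

```
db.items.update(
    { status: "pending" },
    { $set: { status: "processed" } },
    { multi: true }
)
```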
[14:02:56] <tscanausa> so I have secondaryPreferred set through my client, but load / ram has jumped 4x on the primary compared to the secondaries.
[14:15:55] <G1eb> saml, so if you do a find and mongo hits an index it is not considered a collection scan? (since it does not scan the whole collection)
[14:16:16] <saml> G1eb, from my understanding, yes.
[14:16:32] <saml> G1eb, what's the context? why do you want to know?
[14:16:44] <G1eb> ah I get it now, it is, except probably for the case where mongo only hits the index because it's specified in the sort
[14:17:08] <G1eb> yeh, it was a mongo homework assignment, collection scan was true
[14:17:41] <G1eb> I figured any time it hits the index it's not a collection scan, but clearly that's only the case when it uses find
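For reference, explain() in the 2.x shell shows which of the two happened (placeholder collection and field names):

```
db.grades.find({ student_id: 42 }).explain()
// "cursor" : "BasicCursor"              -> full collection scan
// "cursor" : "BtreeCursor student_id_1" -> an index was used
```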
[14:18:24] <saml> what school gives you mongodb homework?
[14:18:32] <saml> quit that school unless it's free
[14:35:07] <tinco> well you could use a graph database of course, but your graph database is just an abstraction over a column store, just like a document store is
[14:35:49] <Nodex> I personally prefer to use the right tool for the job
[14:55:18] <gskill934> so, I thought a JSON format like mongodb has is better than a 2-dimensional table
[14:55:47] <boutell> Thinking about storing permissions in mongodb. Right now I have arrays holding the ids of people and groups who can, let’s say, edit a specific thing: editPersonIds and editGroupIds. By indexing these fields I can efficiently find out if Bob can edit something (in his own right or via a group he’s in).
[14:56:28] <boutell> but I’d like to generalize to more permissions, and I’m wondering if I can efficiently index it still. For instance, there might be “view”, “contribute” and “edit” permissions, and more later.
[14:57:03] <boutell> one possibility is a single array in which the values include both an id and a prefix indicating the permission, like this: permissions = [ 'edit-57', 'contribute-42' ]
[14:57:55] <boutell> I wonder if I can index a two-dimensional array instead though. permissions['edit'] = [ '57', '92' ]
[14:59:42] <boutell> I suspect my one-dimensional array is the best that can be done here.
[15:04:41] <tscanausa> I guess map reduce only works on the master, which is different from what the docs say
[15:11:05] <Nodex> boutell : I store permissions in a 1 dimensional array
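A sketch of that one-dimensional layout with boutell's prefixed values (collection name, thingId, and the ids are placeholders); one multikey index then backs every permission check:

```
db.things.createIndex({ permissions: 1 })

db.things.insert({
    title: "some doc",
    permissions: ["edit-57", "contribute-42", "view-92"]
})

// Can person/group 57 edit this thing? (index-backed exact match against the array)
db.things.find({ _id: thingId, permissions: "edit-57" })
```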
[16:31:08] <no_mind> I am using flask with mongoengine. I have a date input field that is mapped as DateTimeField in the model. When I try to validate the form, I consistently get an error "Not a valid datetime value" . How do I fix it ? The date is sent from the browser in Y-m-d format
[17:56:43] <proteneer> https://aws.amazon.com/marketplace/pp/B00CO7AVMY/ <-- do these AMIs support SSL?
[19:19:47] <travisgriggs> all of a sudden, pymongo.MongoClient.database_names() has quit working. it's failing in createPrivateMap on a file that hasn't been written to in months (don't use that db anymore, just haven't cleaned it out yet). any suggestions? I have zero mongo admin experience. til now, I only had to apt-get install it and it's just worked wonderfully. I *am* on a 32bit box (I know I need to switch to 64 bit some day), but I should be way below the memory limit
[19:24:41] <Zelest> ugh, anyone happen to know when 2.6 is ported to FreeBSD? :o
[19:38:37] <travisgriggs> a reboot of my box doesn’t fix the error. any help much appreciated. is there a better place to ask these questions?
[19:42:05] <Joeskyyy> travisgriggs: Only time I've ever seen that kind of error is when your resources are being overutilized
[19:44:13] <Joeskyyy> What's your free -m look like?
[19:48:16] <Odin^^> Hi guys.. Anyone here? I'm having some strange issues with mongo. Doing a query on two fields, one a string and one a collection. If I do either one I get the right data; if I do them together I get 1/10th of the results. And when I hint the query it uses some random index that has nothing to do with the query.
[20:04:17] <cheeser> use the same query values then
[20:07:42] <Odin^^> ok, so an index is updated when documents change?
[20:08:09] <cheeser> yes, indexes are updated when documents change.
[20:11:42] <Odin^^> Ok, either this is not working or I'm an idiot. If I need help, can someone here do that, for $$ of course? Needs some way of identifying because I have a client production server that I need to give access to.
[20:30:08] <Odin^^> Does index creation order have anything to do with how indexes are selected when you query?
[20:30:37] <cheeser> the order the indexes are created? or the order in which the fields are listed in an index?
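If the planner keeps picking an unrelated index for the two-field query, one option is a compound index on both fields plus an explicit hint to pin it down (field and collection names here are placeholders):

```
db.items.createIndex({ status: 1, tags: 1 })
db.items.find({ status: "active", tags: "foo" }).hint({ status: 1, tags: 1 })
```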