[00:05:28] <lacrymology> this already answers me, but is no-relations really a good way of doing things? I mean, say I make a social network: do I really need to search every user's contact list and modify it when I'm deleting a user account? http://stackoverflow.com/questions/3128200/mongo-db-relations-between-objects
[00:12:42] <preaction_> lacrymology, a single $remove query would do it, assuming the contact list was an array inside of the user document
[00:16:05] <preaction_> lacrymology, http://docs.mongodb.org/manual/reference/operator/pull/ <- this looks to be what you want
[00:17:37] <lacrymology> preaction_: but I'd need to run that for every user that has the user being deleted in their list, or maybe on all of them, if the relationship's not symmetrical...
[00:19:09] <lacrymology> preaction_: compared to SQL's `DELETE FROM relations_table WHERE to_id = deleting_user.id OR from_id = deleting_user.id;` I'd need some sort of loop (so far I've only seen mongoose, and I saw this being done by iterating over the User collection)
[00:19:11] <preaction_> lacrymology, you would run one query. only one query: update({ contact_list: { $in: ["username_being_deleted"] } }, { $pull: { contact_list: "username_being_deleted" } }, { multi: true })
[00:19:30] <preaction_> hell, i'm not sure you need the { $in ? } part
[00:20:16] <lacrymology> preaction_: ah, ok. I guess I'd seen a simplified example
[00:20:35] <preaction_> no, you saw an example that used an ORM, and so had to use only the ORM's features
[00:20:58] <lacrymology> preaction_: mongoose is considered an ORM?
[00:21:24] <preaction_> that's what it looks like to me, if we're both talking about mongoosejs.com
[00:21:42] <lacrymology> I'm pretty sure it can run "raw" queries as well
[00:25:22] <lacrymology> preaction_: but ok, thanks, that clarifies things a bit
[00:28:17] <lacrymology> preaction_: imagine the contact list is something like [{ username: 'foo', groups: ['a', 'b', 'c'], picture: 'path/to/pic.png' }, { username: ... }]. Will your query remove the whole object from the list, or will it find nothing because no element of the list is just 'username_being_deleted'?
[00:29:28] <lacrymology> or maybe change it to something like `update({ 'contact_list.username': "username_being_deleted" }, { $pull: { contact_list: { username: "username_being_deleted" } } }, { multi: true })`?
[00:30:31] <preaction_> well, the query would need to be updated to match the structure you're using, but sure why not
[00:37:43] <lacrymology> preaction_: I don't know, I'm asking =)
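(For reference, a pymongo 2.x-style sketch of the one-query removal described above; the "social" database, "users" collection, and field names are assumptions.)

    # Remove a deleted user from every contact list in one update,
    # with no loop over the User collection.
    from pymongo import MongoClient

    db = MongoClient().social

    # Contact list as an array of plain usernames:
    db.users.update(
        {"contact_list": "username_being_deleted"},
        {"$pull": {"contact_list": "username_being_deleted"}},
        multi=True,  # apply to every matching document
    )

    # Contact list as an array of embedded documents, e.g.
    # [{"username": "foo", "groups": [...], "picture": "..."}]:
    db.users.update(
        {"contact_list.username": "username_being_deleted"},
        {"$pull": {"contact_list": {"username": "username_being_deleted"}}},
        multi=True,
    )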
[03:05:17] <capnkooc> hi, I'm trying to filter an array of ObjectIds with $gte and $lt, using the creation date stored in the id, but I don't really know how to do it right
[03:05:34] <capnkooc> my code is something like this:
[03:06:45] <capnkooc> for example, it works if I replace 'resources.queues.ids' with '_id', and I can get the accounts created within that date range with that filter
[03:07:12] <capnkooc> but I don't really know how to do the same thing with an array of ObjectIds
[03:07:49] <capnkooc> can somebody please help me out, or maybe point me in the right direction?
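(Since ObjectIds embed their creation time, the range bounds can be built as synthetic ObjectIds; a pymongo sketch where the "accounts" collection and the dates are assumptions, and the field name comes from the question.)

    from datetime import datetime
    from bson.objectid import ObjectId

    # Synthetic ObjectIds that carry only a timestamp, usable as
    # range bounds against real ids.
    start = ObjectId.from_datetime(datetime(2013, 1, 1))
    end = ObjectId.from_datetime(datetime(2013, 2, 1))

    # Match documents whose array holds an ObjectId created in range.
    docs = db.accounts.find({
        "resources.queues.ids": {"$elemMatch": {"$gte": start, "$lt": end}}
    })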
[07:39:34] <chia> Hi, I am new to mongodb and have a question, maybe the most-asked question, but anyway :). I have 2M items to store; each item is updated once a day, and each item has a property that grows every day, like a subscription. Should I store the items as documents or give each item its own collection? It's a question of one big collection vs. a huge number of collections :)
[08:43:04] <Jb__> does anyone use the mongo-c-driver for win64?
[08:43:57] <Jb__> I've got some issues compared to the 32-bit version of the same client program
[10:09:08] <chia> Is it good to have 2 million collections? I wanted to make an "activity stream" and was thinking of having a collection for every user. Is that a good decision or a bad one?
[10:12:54] <chia> I heard creating a lot of collections has no disadvantages. I was thinking of keeping all of a user's information in their own separate collection, which for some users could have anywhere from 100 to 10000 rows per day.
[10:14:33] <chia> "Generally, having large number of collections has no significant performance penalty and results in very good performance. Distinct collections are very important for high-throughput batch processing."
[10:14:40] <chia> from the docs http://docs.mongodb.org/manual/core/data-modeling/#data-model-large-number-of-collections
[10:15:11] <chia> But of course I don't know what is considered a large number :)
[10:15:20] <Nodex> I don't think that means 2 million collections
[10:17:51] <chia> I see, but I was thinking of using the TTL feature to auto-discard entries older than a week. If I put this information in one collection, with { id: user_id, activities: [] }, the activities array will get too big for some users, and I don't know if that is good or bad (the 16MB document limit won't be crossed, however)
[10:19:36] <Nodex> for me, if I were doing a timeline series I would give each user one document a day or w/e
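(A pymongo sketch of that one-document-per-user-per-day layout, with a TTL index covering the one-week auto-discard mentioned above; collection and field names are assumptions.)

    from datetime import datetime, time

    def log_activity(db, user_id, event):
        # One bucket document per user per day; upsert=True creates
        # the day's bucket on the first write.
        day = datetime.combine(datetime.utcnow().date(), time.min)
        db.activity.update(
            {"user_id": user_id, "day": day},
            {"$push": {"events": event}},
            upsert=True,
        )

    def ensure_ttl(db):
        # TTL index: MongoDB deletes each bucket ~1 week after "day".
        db.activity.ensure_index("day", expireAfterSeconds=7 * 24 * 3600)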
[11:33:56] <strigga> Hi there, I am pretty new to mongodb. I have set up a mongo instance and can successfully write documents into the DB. Now (Level 2 :) ) I am trying to write files into GridFS on the same instance, using PHP. When I do this, I get the error: PHP Fatal error: Uncaught exception 'MongoGridFSException' with message 'Could not store file: Can't take a write lock while out of disk space' in /var
[11:38:17] <strigga> error failed to allocate new file: /var/lib/mongodb/stomb.ns size: 16777216 errno:13 Permission denied
[11:38:47] <strigga> alright - funny that writing "normal" documents seems to work - lemme double check
[11:40:30] <strigga> Derick: Mongo runs as user mongodb on linux, right? The folder and its contents belong to mongodb, so I don't see an issue there. I have to say that /var/lib/mongodb is a symlink to a mount (which is owned by mongodb as well)
[13:30:22] <strigga> *godda**it* arrrgh. Trying to save binary data in GridFS, and whatever I try I get an error message: PHP Fatal error: Uncaught exception 'MongoGridFSException' with message 'Could not store file: _filename.empty()' in /var/www/inc_functions/snap_functions_v2.php:43 -
[13:30:38] <strigga> this is the code i am using (simplified as much as possible) http://pastebin.com/GVEYN25T
[13:35:30] <Nodex> you have to write your data to a file and feed it the file iirc
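(The code above is PHP; for comparison, Python's gridfs module accepts raw bytes directly, with no temp file needed. The database name is taken from the .ns file mentioned earlier, the rest is illustrative.)

    import gridfs
    from pymongo import MongoClient

    db = MongoClient().stomb
    fs = gridfs.GridFS(db)

    # store binary data straight from memory...
    file_id = fs.put(b"...binary data...", filename="pic.png")

    # ...or stream it from a file on disk
    with open("/path/to/pic.png", "rb") as fh:
        file_id = fs.put(fh, filename="pic.png")

    data = fs.get(file_id).read()  # read it back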
[13:52:04] <strigga> still strange, but I think I am on a good path :)
[13:55:07] <strigga> Nodex: I removed the .ns file again to have mongo generate it itself, and I still get the same error. Writing to a "normal" document collection in the same folder works fine.
[13:57:57] <Nodex> can you not just chown the parent directory to mongodb?
[14:18:34] <noverloop> question about restoring mongodb backups
[14:18:50] <noverloop> can I just restore the primary and assume the slaves will sync?
[16:31:38] <murilo> Hello guys, I am running mongo 2.2.2 and I have some documents like { A: 1, B: 2, C: 3 } and others like { B: 1, A: 2, C: 3 }, but when I run a find() I want all of them to come back like { A: 1, B: 2, C: 3 }. Do you know how to sort the fields in a query's output?
[16:32:57] <Derick> that's not possible unless you have an ODM or a layer inbetween. MongoDB stores and returns as-is.
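(If the reordering has to happen client-side, one option in Python is to re-key each document into an ordered container; the "things" collection name is an assumption.)

    from bson.son import SON

    # MongoDB returns fields as stored, so sort the keys client-side.
    for doc in db.things.find():
        ordered = SON(sorted(doc.items()))
        print(ordered)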
[16:45:47] <ctorp> Is it possible to export a prod AWS DynamoDB into a mongodb for local testing, or is there a better method? I'd rather not test against the prod system where possible
[16:56:39] <gazarsgo> ctorp: you could prob use the new data flows thingy to do it, but easiest is dynamo->s3->mongo
[16:57:07] <gazarsgo> might need data pipeline to go from dynamo to s3 anyway, i forget
[16:58:17] <ctorp> gazarsgo: have you heard of this being done before? I can't find any examples on google, so I was thinking there must be a different standard. Do most AWS/Heroku-based companies test against prod systems?
[16:59:05] <gazarsgo> it's very common to scale out of dynamo because of pricing issues at scale, but I haven't run the new numbers; I think you get a lot more headroom in dynamo now with the reserved throughput
[16:59:06] <nDuff> ctorp: It's common to have multiple AWS accounts -- the QA one completely separate.
[16:59:22] <gazarsgo> multiple AWS accounts is just overhead, you can isolate environments fine with IAM
[16:59:38] <gazarsgo> but yes, many people do multiple accounts anyway. i prefer not to
[17:00:09] <ctorp> You guys are awesome. I felt like I was fumbling in the dark when looking for this info.
[17:01:37] <ctorp> Hmm. The Data Pipeline service supposedly can auto-generate the EMR and procedure for S3 to Dynamo and back, but the throughput is outrageously limited in all the tests I ran
[17:02:31] <ctorp> It took 12 hrs to move less than 2GB of data from dynamo to s3 at full throughput ratio
[17:02:57] <ctorp> (Which is kind of what led me down this path of questioning)
[17:03:15] <gazarsgo> you'll have to tweak the procedure to parallelize it
[17:07:42] <gazarsgo> there's a detailed script here: http://stackoverflow.com/questions/13630641/backup-aws-dynamodb-to-s3
[17:07:57] <gazarsgo> and all you'd need to do is parameterize your hive export (range based)
[17:27:49] <gazarsgo> is 1TB too large for an initial mongodb cluster?
[19:13:27] <Gx4> if I did `coll = conn.db['ts']` and then `for t in tracks: train = { 'name': t.name, 'size': t.size }; coll.insert(train)`, how do I retrieve those values?
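(Reading them back is just a find; a pymongo sketch reusing the names from the question.)

    # Retrieve every inserted track, projecting just the two fields.
    for doc in coll.find({}, {"name": 1, "size": 1, "_id": 0}):
        print(doc["name"], doc["size"])

    # Or fetch a single track by name.
    track = coll.find_one({"name": "some track name"})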
[19:19:22] <Gx4> is there a visual tool that shows mongodb data?
[20:05:23] <fjay> i wanted to make sure i wasnt sniffing glue or something
[20:09:51] <A_Nub> Hi, I was wondering if anyone here is experienced with the mongodb code base and could point me in the right direction. I am essentially looking for 2 parts of the code: 1) where ClientInfo objects are created, and 2) where incoming queries (find, insert, update, upsert, etc.) are processed per connection.
[20:10:26] <A_Nub> I am browsing around it blindly right now, just would be nice to get some more direction.
[20:22:11] <fuzzpedal> Hi, I'm having problems with adding pymongo to twisted in my buildout - any help appreciated :)
[20:27:02] <prawnsalad> hey guys. can you use collection capping on embedded collections?
[20:28:06] <prawnsalad> i have multiple logs (capped collections) per user that would make sense to store under each user document
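(Capping is a collection-level feature, so it can't be applied to an array inside a document, but a similar effect is possible by trimming the embedded array on every write; a pymongo sketch assuming MongoDB 2.4+ and illustrative names.)

    # Keep only the newest 1000 log entries inside each user doc by
    # combining $push with $each and a negative $slice.
    db.users.update(
        {"_id": user_id},
        {"$push": {"log": {"$each": [entry], "$slice": -1000}}},
    )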
[20:56:40] <Guest64045> hi guys, I'm building an app where custom data can be sent by the customer. How could I search this custom data (it doesn't necessarily need an index)?
[21:01:14] <Guest64045> hi guys, I'm building an app where custom data can be sent by the customer. What is the best approach to handling queries on these fields? They are not indexed; should I create an index for every "searchable" custom field?
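(One common way to keep arbitrary customer fields searchable with a single index is the key/value-pair pattern; a pymongo sketch with assumed names.)

    # Store custom fields as {k, v} pairs so one compound index
    # covers every "searchable" attribute.
    db.items.insert({
        "owner": "customer1",
        "attrs": [{"k": "color", "v": "red"}, {"k": "size", "v": "XL"}],
    })
    db.items.ensure_index([("attrs.k", 1), ("attrs.v", 1)])

    # Any custom field can then be queried through the same index.
    db.items.find({"attrs": {"$elemMatch": {"k": "color", "v": "red"}}})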
[22:20:39] <Gx4> how do I create a unique index? collection.create_index('id', { 'unique': 'true' }) ?
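(Almost: in pymongo the index options are keyword arguments, and unique is a boolean rather than the string 'true'. A corrected sketch with an assumed collection:)

    from pymongo import MongoClient

    collection = MongoClient().mydb.mycollection
    collection.create_index("id", unique=True)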