[02:07:47] <IAD1> A bunch of RDBMS guys walk into a NoSQL bar. But they don't stay long. They could not find a table.
[02:38:04] <Guest92855> Hi, I'm having trouble linking documents in the mongo shell and was wondering if I could get some help? I have an owner document and an asset document. When I insert an asset I would like to add an attribute OwnerId set to an id from an owner doc I query. Any ideas?
[02:39:50] <crudson> Guest92855: I would suggest what you describe :) Set an attribute to owner._id
[02:44:30] <Guest92855> How would I go about that though? What I have tried so far is, in my insert, to $set that new field to the result of a query. However the query returns {_id : ObjectId('somestring')}. How can I have it return only the ObjectId without the field name?
[02:48:26] <Guest92855> This is my insert: db.description.update({Brand : "Lenovo"}, {$set : {OwnerId : db.owner.find({Name : "CPU Guyz"}, {_id : 1})}})
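A working version of what Guest92855 is trying, sketched with the collection and field names from the messages above: find() returns a cursor, so use findOne() and take the _id off the returned document.

    var owner = db.owner.findOne({Name : "CPU Guyz"});   // findOne returns the document itself, not a cursor
    db.description.update({Brand : "Lenovo"}, {$set : {OwnerId : owner._id}});   // store only the ObjectId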
[04:08:30] <SirFunk> this may be a silly question. is there any way to extract the timestamp from an oid in json?
[04:46:47] <IAD1> SirFunk: if you need a timestamp - just create a new field for it
[04:47:40] <SirFunk> yeah... the problem I am running into is that I am using mongolab and creating records with REST and it doesn't have a way to submit Timestamp()
[04:47:55] <SirFunk> I could create one on the client side and store it as text... but that's kinda icky.
[04:48:06] <SirFunk> i figured if it was already part of the id.. then... yay
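The shell does expose what SirFunk is after: the first four bytes of an ObjectId encode its creation time, available via getTimestamp(). A minimal sketch, with a hypothetical collection name:

    var doc = db.records.findOne();   // any document whose _id is an ObjectId
    doc._id.getTimestamp();           // returns an ISODate derived from the id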
[08:46:07] <MongoDBIdiot> are you talking to yourself?
[08:47:11] <ppetermann> i don't think data is a reserved word
[08:50:22] <ndee> hi there, I have a collection of different search queries and I would like to create reports. http://pastebin.com/5PfTHkTP <-- that's a default document. My query for getting the data at the moment is as follows: http://pastebin.com/UbDehWWc <-- That query takes a couple of seconds since there are around 800'000 documents. What would be the best way to speed that up?
[08:53:18] <ppetermann> ndee: why using a unix timestamp, and not a date field? also, have you set your indexes right?
[08:54:05] <ndee> ppetermann, I'm used to unix timestamps but yes, I could change that to a date field. I have an index like this: db.queries.ensureIndex({timestamp:1, searchMake:1});
[08:58:04] <ndee> ppetermann, would it be faster to change that to a find({searchMake: 222, timestamp: 292349234}).count() and iterate over the timestamps and searchMakes ?
[09:02:41] <NodeX> fredix: is that your exact string?
[09:14:02] <ndee> NodeX, ok, to be clear, I dropped all indexes and added the following one: db.queries.ensureIndex({searchMake:1, timestamp:1}). But that didn't speed up the following query: db.queries.aggregate({$group:{ _id : { searchMake: '$searchMake', timestamp: '$timestamp'}, queriesPerMake : { $sum : 1}}});
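One common way to make that pipeline cheaper, sketched with a hypothetical timestamp range and assuming an index with timestamp as its leading field: $group itself cannot use an index, but an indexed $match placed first shrinks what $group has to scan.

    db.queries.aggregate(
        {$match : {timestamp : {$gte : 1349049600, $lt : 1351728000}}},   // indexed range scan first
        {$group : {_id : '$searchMake', queriesPerMake : {$sum : 1}}}     // then group the smaller set
    );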
[09:15:21] <yuriy> after a mongo 2.2 server dirty shutdown I'm getting "expected to be write locked for" errors when doing MapReduce. Falling back to mongo 2.0 doesn't produce such errors. Can you point out where I should look to fix this issue with 2.2?
[09:17:21] <ndee> NodeX, I added an index on searchMake and a db.queries.distinct("searchMake") and it takes 1018ms. I know that there are 800'000 records but that is pretty long IMO.
[09:18:16] <IAD1> ndee: you can create an index on "searchMake"
[09:18:36] <ndee> IAD1, that's what I did. I do have an index on "searchMake"
[09:20:14] <IAD1> ndee: how much RAM does your server have? It should be enough for the indexes, at least
[09:21:28] <ppetermann> ndee: what does explain say?
[09:21:36] <ndee> IAD1, the server has 16GB of RAM, although I have to say it's a default mongo installation without any "tweaks" on ubuntu 10.04
[09:24:19] <ndee> ppetermann, can't use .explain on .distinct
[09:26:51] <NodeX> distinct on large datasets is always slow
[09:27:02] <NodeX> it's the building of the array that takes the time, not the query
[09:44:43] <NodeX> then 1 second is as fast as it will get
[09:47:44] <ndee> NodeX, wow, that is kinda disappointing :(
[09:48:28] <ndee> I guess the right tool for the right job is still true :D
[09:48:44] <ron> umm, what NodeX may have meant to say was "then 1 second is as fast as it will get with your current setup".
[09:49:18] <ndee> ron, yeh, but still. I'm gonna test it with a mysql database
[09:49:43] <NodeX> ndee : right tool for the right job is true :)
[09:49:59] <NodeX> if sql fits your aggregation needs better then use it :D
[09:50:26] <ndee> that's what I'm gonna do :D I was just so happy with the schemaless setup :D
[09:51:38] <MongoDBIdiot> yes, using tools without knowing what you are doing
[09:52:00] <NodeX> it doesn't mean that you have to use MongoDB for everything or *SQL for everything, plenty of people use sql / mongo for parts of their app
[09:52:28] <NodeX> and dont listen to MongoDBIdiot - he is retarded and doesn't have a clue what he is doing :)
[09:52:40] <NodeX> and is a general menace to this channel
[09:54:25] <ndee> I guess a mix of SQL and NoSQL is the way to go.
[09:54:45] <NodeX> ndee : 10gen are always improving performance
[09:54:57] <NodeX> so in the future you might get better results
[09:55:10] <ndee> I think I found the answer to everything right now. In the end, a mix is nearly always the best solution.
[09:55:15] <ron> ndee: the solution can be as simple as adjusting your data model, but unfortunately I didn't follow the whole chat, so I can't advise any further.
[09:55:21] <ndee> You cannot always be nice but also not always be bad.
[09:55:42] <NodeX> personally for me I adjust my data model so I have less in my stack
[09:55:56] <ndee> ron, yes, I'm thinking about that too now, I come from the sql-world so the data model might not be the best :D
[09:55:57] <NodeX> but that's how my apps are suited, it's probably not the same for you
[09:56:39] <ron> ndee: most people come from the sql world, as that's what's been there. the concept of properly modeling data is a known issue during the transition phase.
[09:57:28] <NodeX> best advice is to forget everything you know about SQL if you want to adopt a schemaless way of attacking a problem
[09:57:31] <ndee> actually, what I wanna do is the following: analyze an apache log file that contains the search queries. So I extract the data and put it into mongodb to analyze it.
[14:47:02] <vhpoet> ah ok, now I see. Thank you :)
[14:49:51] <BramN> Is it possible to use find() to find regexes, stored in MongoDB, that match a value passed in the query? So I store a number of regexes in Mongo and want to find the documents whose regex matches with a value i send to find()
[14:53:27] <BramN> Okay, thanks... never heard of it before... guess I'll just have to loop through them...
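The loop BramN settles on might look like this in the shell; the rules collection and its pattern field are hypothetical, with each document storing its regex as a string:

    var value = "input to test";
    db.rules.find().forEach(function(rule) {
        if (new RegExp(rule.pattern).test(value)) {   // compile the stored string, test client-side
            printjson(rule);
        }
    });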
[15:03:20] <bhosie> i understand that the oplog file will split multi updates into individual updates but what about batch inserts? are those also split into individual inserts? if i have a write intensive collection with batches of 20K docs being inserted roughly every 30 seconds, do i / should i consider increasing my oplog size? I have a 3 member replica set
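One way for bhosie to judge that from the shell: db.printReplicationInfo() on a replica set member reports the configured oplog size and the time window the current oplog spans; if that window is short relative to how long a member could be down, a bigger oplog is warranted.

    db.printReplicationInfo();   // prints oplog size and the time range it currently covers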
[15:05:28] <vhpoet> Hey, one another very simple question. http://pastie.org/5119235
[15:07:34] <Gargoyle> vhpoet: Not entirely sure what dbref() is doing, but I'll have a guess that your user data is not being stored under the field names you think it is!
[15:08:10] <vhpoet> it's a reference to another collection's document
[15:08:41] <Gargoyle> IIRC, any such references are just driver sugar. It's not actually part of mongo.
[15:09:19] <Gargoyle> However, I suppose the answer might be to change your find to match the data. ie find({'user': DBRef(…)})
[15:09:44] <vhpoet> oh, ok, thank you. Maybe I should read php mongo doc for this.
[15:24:10] <nopz> What is faster: updating a whole document when I have to change only one key:value of it, or using {$set : {key : value}}?
[15:27:27] <unpaidbill> Has anyone here used the perl driver and the aggregation framework together? I'm having some trouble figuring out how to structure my request
[15:27:29] <skot> vhpoet: there were two problems. One, field names are case sensitive, two, you need to use the correct type as has already been suggested: DBRef(…)
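A sketch of the fix skot describes; the collection and field names here are hypothetical since vhpoet's pastie isn't quoted in the log. The query value has to match the stored reference in both case and type:

    var u = db.users.findOne({username : "vhpoet"});
    db.posts.find({user : new DBRef("users", u._id)});   // query with a DBRef, not a bare id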
[15:28:10] <nopz> skot, Ok, because I'm not sure about performance for a single document.
[15:29:02] <skot> nopz: in general $set is best. It reduces network data, oplog storage, and is more readable in many cases.
[15:57:52] <NodeX> also look at gridfs if you want to store it in the database
[15:59:19] <esc> is mongodb developed by a company?
[16:03:16] <skot> The main contributor is 10gen, yes.
[16:03:22] <finalspy> Hi everybody, new to mongo and sharding, I'd like to know how to shard a collection on a key that may be null, is there a way to do that ?
[16:03:48] <skot> yes, null is a valid value; the shard key must be immutable for the docs so you cannot change it.
[16:04:12] <skot> (a delete + insert works if you need to change the value)
[16:04:49] <finalspy> in fact i get this : "errmsg" : "found null value in key { area: null } for doc: { ...
[16:05:25] <finalspy> when running db.runCommand("shardcollection" ...
[16:06:05] <finalspy> i'm using mongo 2.2.0 on linux mint (debian) installed from 10gen repos
[16:07:52] <esc> is there a fast way to check if an _id is contained in a collection?
[16:12:58] <skot> you can also use a covered query to only use the index to return only the _id value, but count is effectively the same.
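Both forms skot mentions, sketched with a hypothetical id and collection:

    var id = ObjectId("507f191e810c19729de860ea");
    db.things.count({_id : id});              // 0 or 1, answered via the _id index
    db.things.find({_id : id}, {_id : 1});    // covered query: served from the index alone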
[16:32:10] <finalspy> so I want to shard collection, my index is on area field but sh.shardCollection(... gives me "errmsg" : "found null value in key { area: null } for doc: { ...
[16:32:51] <finalspy> In fact I noticed that some docs don't even contain the area attribute which mongo is complaining is null ...
[16:33:23] <finalspy> Does that mean I can only use shard keys from fields present in all documents of a collection?
[16:37:28] <finalspy> So question is: is it possible to use a field that is not present on all the documents of a collection as a shard key? ... seems not, but I can't find anything on that.
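The answer finalspy is circling is yes: every document needs a value for the shard key. A sketch of backfilling before sharding, with hypothetical names and default value:

    // give documents that lack the field an explicit value first
    db.mycoll.update({area : {$exists : false}}, {$set : {area : "unknown"}}, false, true);   // multi update
    sh.shardCollection("mydb.mycoll", {area : 1});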
[16:46:10] <kevin_> Hi Guys, quick question, is there a special option in mongo to get the distance from the queried point to the records received, ex, place 1 is 0.1m away, place 2 is 0.11m away
[16:46:58] <kevin_> (and also sort the results by distance)
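What kevin_ describes is what the geoNear command returns, sketched here assuming a 2d index on a hypothetical loc field; results come back nearest first, each with a dis field:

    db.places.ensureIndex({loc : "2d"});
    db.runCommand({geoNear : "places", near : [-73.97, 40.77], num : 10});
    // each entry in "results" looks like {dis : <distance>, obj : <document>}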
[17:54:28] <wtpayne> Good thing mongodb is nice & simple - not all that much reading required.
[17:54:32] <ruffyen> as for your space recovery, no, I never have, but you should be able to just stop mongod, add space to the drive, delete mongo.lock and then start mongod again
[17:54:47] <wtpayne> Heh. I cannot add space to the drive, unfortunately.
[17:57:10] <wtpayne> I keep getting SyncClusterConnection::update prepare failed exceptions.
[17:57:29] <wtpayne> ... Actually, come to think about it, it might be the config servers that have gone down.
[17:57:58] <ruffyen> yeah, googling that error points at config servers
[18:02:49] <wtpayne> Thanks... trying to find out which machines the config servers are on and bounce them.
[18:14:44] <wtpayne> Drat. Bouncing the config servers did not work.
[18:24:37] <wtpayne> Ok. So I have a bunch of DBs, some of which are old and not needed anymore.
[18:25:33] <wtpayne> I am happy to totally lose those DBs, so deleting the files <DBNAME>.* from the mongodb data directory (on all shards) does not seem (to me) to be that unreasonable.
[18:25:50] <wtpayne> Particularly since the dropDatabase() command does not work.
[18:39:50] <wtpayne> Nobody going to try and stop me?
[21:20:35] <Dr{Wh0}> trying to test sharding and see how it scales but I am not getting expected results. I have 4 shards set up and if I run my test insert app to add 5m rows as fast as possible I get 120k/s inserts if I direct each app to a specific shard. If I run just 2 apps connected to 2 separate routers connected to a sharded collection where I see "ok" distribution I end up with about 30k/s, so it seems as if it does not scale correctly. Where could the bottleneck be?
[21:20:37] <Dr{Wh0}> I tried with 1 router or 2 routers I have 3 config servers.
[22:07:25] <TecnoBrat> one of the advantages is family, heh
[23:32:17] <jmpf> I must be doing something wrong -- http://pastie.org/private/xgs1ulpuzy2ljbe46ciq0q - why is it doing a full collection scan even w/hint?
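For reference, since jmpf's actual query sits behind a private pastie, the usual shape of a hinted query: the spec passed to hint() must exactly match an existing index, key order included, or the hint is rejected.

    db.coll.find({a : 5}).hint({a : 1}).explain();   // "cursor" should report BtreeCursor a_1, not BasicCursor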