#mongodb logs for Monday the 14th of January, 2013

[08:38:11] <[AD]Turbo> hola
[10:36:04] <NodeX> _o_
[10:36:21] <ron> NodeX: http://9gag.com/gag/6327098
[10:38:39] <NodeX> wtf lol
[10:38:46] <NodeX> people are crazy
[10:39:04] <ron> yes.
[11:54:14] <zell> i'd like to query a "document B" property contained in a list of embedded "document B"s inside a "document A".
[11:55:27] <kali> zell: dot notation
[11:55:29] <zell> for instance i'd like to query the "name" property on the following: { id: 1, foo: 'bar', embeds: [ {id: 1, name: '1'}, {id: 2, name: '2'}]}
[11:55:46] <zell> kali: i tried
[11:56:31] <zell> kali: db.document.find({'embeds.name': '2'})
[11:57:15] <zell> i also tried: db.document.find({'embeds': {$in: { name: '2'}}})
[11:57:45] <zell> s/name/embeds.name/
[12:00:25] <kali> zell: you're doing something wrong. "embeds.name": "2" should work
[12:01:57] <kali> zell: http://pastie.org/5682520
[12:14:05] <NodeX> he's probably expecting the embedded doc back not the parent lol
[12:15:02] <NodeX> which would be a nice addition to projecting what's returned ... this || this.parent()
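
For reference, a minimal shell sketch of the dot-notation query kali links to, plus the positional projection NodeX alludes to. The collection name `document` is taken from zell's own attempts; the positional `$` projection assumes a 2.2+ server:

    db.document.insert({ id: 1, foo: 'bar', embeds: [ { id: 1, name: '1' }, { id: 2, name: '2' } ] })
    db.document.find({ 'embeds.name': '2' })                      // returns the whole parent document
    db.document.find({ 'embeds.name': '2' }, { 'embeds.$': 1 })   // projects only the first matching embed
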
[12:16:35] <kali> NodeX: i should probably make a macro for this question too :)
[12:19:42] <zell> kali: thx
[12:29:12] <NodeX> heh
[12:29:54] <abhi9> any way to extract only the date from a datetime object, for a query? something similar to $dayOfMonth?
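
abhi9's question goes unanswered in the log; for what it's worth, the 2.2 aggregation framework does have date operators covering this. A hedged sketch, with the collection and field names invented for illustration:

    // pull day/month/year components out of a stored date with $project
    db.events.aggregate({ $project: {
        day:   { $dayOfMonth: '$ts' },
        month: { $month:      '$ts' },
        year:  { $year:       '$ts' }
    } })
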
[13:18:46] <akaIDIOT> hmm
[13:19:02] <akaIDIOT> where can i find a version mapping for mongodb version → java api version
[13:19:12] <akaIDIOT> curious that those are so different ;(
[13:21:02] <kali> i don't think such a thing exists, but the latest versions should work reasonably with anything 1.6+ i would say
[13:21:13] <kali> maybe some glitches to expect with replication
[13:21:23] <akaIDIOT> alright
[13:21:30] <akaIDIOT> so just downloading 2.10.1 would suffice
[13:21:36] <akaIDIOT> running server 2.0 or 2.2
[13:21:38] <akaIDIOT> not sure
[13:23:02] <kali> yeah. no problem with that. no miracle either of course: stuff like aggregate() will not work on a 2.0
[13:23:19] <akaIDIOT> not using many funky things
[13:23:29] <akaIDIOT> so i'll be fine
[13:23:35] <akaIDIOT> thanks for the response anyhow :)
[13:24:02] <akaIDIOT> the version thing is ungooglable btw, not sure if i'd be the only one wondering P
[13:24:05] <akaIDIOT> *:P
[15:55:45] <balboah> how do you query for a specific query in the profiler output? I want to filter out queries containing a "dotted.query.like.this"
[15:56:19] <balboah> since dots are already special, how do you escape them so that they're interpreted as the field name and not a sub document?
[15:59:36] <NodeX> try escape with \.
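
As far as I know the query language has no documented escape for dots in field paths, so NodeX's \. suggestion may not pan out; a hedged fallback is a JavaScript $where clause, where a dotted key is just an opaque string. `system.profile` is where the profiler writes; the dotted name is balboah's placeholder:

    // find profiler entries whose query used the literal dotted field name
    db.system.profile.find({ $where: function () {
        return this.query && ('dotted.query.like.this' in this.query);
    } })
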
[16:30:45] <morphic> hello, I have a bson with the fields 'value', 'place', 'date', and I want to store more information about the place (like x, y, street). what's the best way to do it, since I will have the same place for different documents
[16:33:20] <Kaim> hi all
[16:33:43] <Kaim> I have a problem inserting logs into a mongo shard cluster
[16:33:58] <Kaim> mongos returns errors such as:
[16:34:06] <Kaim> Mon Jan 14 16:45:33 [conn3352] warning: shard key mismatch for insert { _id: ObjectId('50f4281d6291df0380010839'), time: new Date(1358178268000), __broken_data: BinData }, expected values for { path: 1.0 }, reloading config data to ensure not stale
[16:34:22] <kali> morphic: can you be more specific ? have you imagined one or more schema that we could comment ?
[16:34:39] <Kaim> and my insert hash got path key :/
[16:35:14] <kali> Kaim: are you sure your data is corrent ? because the __broken_data is frightening
[16:35:23] <kali> Kaim: correct^
[16:36:20] <Kaim> yes it is correct
[16:36:27] <morphic> kali: i'm a little lost with relationships in nosql. The idea in SQL is table [Transactions, fk(place_id), value... ] 1 <- 1 [Places, x, y, pk(id) ]
[16:36:48] <Kaim> at the beginning of the insert all is ok, then after some minutes something went wrong
[16:37:42] <kali> Kaim: i think you probably have one document that breaks your import
[16:38:00] <morphic> maybe I can store the Places's ObjectId on Transaction document
[16:38:03] <kali> Kaim: you need to find it
[16:38:43] <kali> morphic: the important thing to think about is: how you will access this data
[16:39:28] <kali> morphic: if it rarely changes and is read constantly, then yes, it's a good idea to embed some or all data from Places in Transaction
[16:40:02] <Kaim> kali, I found sth
[16:40:07] <Kaim> mongos> sh.status()
[16:40:08] <Kaim> Mon Jan 14 17:38:37 decode failed. probably invalid utf-8 string [óF]
[16:40:08] <Kaim> Mon Jan 14 17:38:37 why: InternalError: buffer too small
[16:40:08] <Kaim> Mon Jan 14 17:38:37 InternalError: buffer too small src/mongo/shell/utils.js:1018
[16:40:14] <morphic> yeah the place and coordinates will be grabbed from google maps so, it is more reads and writes.
[16:40:15] <Kaim> maybe that's the pb :/
[16:40:26] <morphic> than writes*
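
A hedged sketch of what kali's suggestion could look like for morphic's case. The field names value/place/date/x/y/street come from morphic's messages; everything else (collection name, keeping a place_id back-reference for the rare updates) is an assumption:

    db.transactions.insert({
        value: 42,
        date:  new Date(),
        place: { place_id: ObjectId(),   // back-reference to a places collection
                 x: 48.85, y: 2.35, street: 'Rue Example' }
    })
    // read path: one document fetch, no join needed
    db.transactions.find({ 'place.street': 'Rue Example' })
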
[16:42:30] <kali> Kaim: what version of mongo are you using ?
[16:42:45] <foofoobar> Hi. I have not used nosql before and because I think it's an interesting topic I am looking to integrate it into a future project of mine. Now I am currently coding something in RoR which is similar to a job board. You can post job offers and people can view them. I already wrote half of this very simple app (most is UI stuff) with a relational sql db
[16:42:56] <foofoobar> I am now thinking what advantages could be there to use a nosql approach here
[16:43:07] <Kaim> root@mongos-1:~# mongod --version
[16:43:07] <Kaim> db version v2.2.1, pdfile version 4.5
[16:43:07] <Kaim> Mon Jan 14 17:41:56 git version: nogitversion
[16:43:16] <foofoobar> Can someone tell me if such a project would be a good start to try out?
[16:43:34] <foofoobar> currently I only have two tables in my relational approach: jobs and users
[16:44:19] <NodeX> foofoobar : http://www.maxhire.net/ <--- written fully on Mongo
[16:44:21] <NodeX> oops
[16:44:23] <NodeX> wrong site
[16:44:39] <NodeX> http://www.jobbasket.co.uk/
[16:44:43] <kali> Kaim: it looks like you're trying to push invalid data to mongodb, like a badly encoded string
[16:44:45] <NodeX> try that instead LOL
[16:45:07] <kali> NodeX: confused about your own site ? :)
[16:45:14] <morphic> foofoobar: http://t.co/Mji8WrUQ
[16:45:21] <NodeX> sorry, I was pasting into skype at the same time
[16:52:37] <foofoobar> morphic, nice post, ty
[16:56:32] <foofoobar> morphic, but besides this, a new project is always a good way to try out new technologies
[16:56:49] <NodeX> foofoobar : check out job basket ^^ that's the sort of things you can do with NoSQL
[16:56:57] <NodeX> +technologies
[16:57:19] <foofoobar> NodeX, yeah, I did. I also read the blog post about their technology behind the page, sounds interesting
[16:57:47] <NodeX> it's moved on somewhat since then, now uses redis and node js for some realtime and some caching
[16:57:58] <NodeX> +as well
[16:59:17] <Kaim> kali, how can I clean that shit? :/
[16:59:23] <Kaim> mongos> use config
[16:59:23] <Kaim> switched to db config
[16:59:23] <Kaim> mongos> db.databases.find()
[16:59:23] <Kaim> { "_id" : "admin", "partitioned" : false, "primary" : "config" }
[16:59:23] <Kaim> { "_id" : "gnisko", "partitioned" : false, "primary" : "shard10" }
[16:59:24] <Kaim> { "_id" : "me", "partitioned" : false, "primary" : "shard10" }
[16:59:28] <Kaim> { "_id" : "\u0002referer", "partitioned" : false, "primary" : "shard10" }
[16:59:30] <Kaim> { "_id" : "3", "partitioned" : false, "primary" : "shard10" }
[16:59:31] <NodeX> use a pastebin dude
[16:59:32] <Kaim> { "_id" : "\u0001", "partitioned" : false, "primary" : "shard10" }
[16:59:34] <Kaim> { "_id" : "hod", "partitioned" : false, "primary" : "shard10" }
[16:59:36] <Kaim> Mon Jan 14 17:56:49 decode failed. probably invalid utf-8 string [óF]
[16:59:37] <foofoobar> Kaim, pasti.org
[16:59:38] <Kaim> Mon Jan 14 17:56:49 why: InternalError: buffer too small
[16:59:40] <Kaim> InternalError: buffer too small
[16:59:42] <Kaim> sorry
[16:59:49] <foofoobar> *pastie.org
[17:00:08] <Kaim> lol
[17:00:52] <kali> man, this is ugly
[17:01:07] <kali> can you reset your cluster ? :)
[17:01:29] <Kaim> that's what I thought...
[17:03:13] <foofoobar> NodeX, I'm still unsure how I would "port" such a relational way to a nosql way.. Let's assume I have the following model: http://pastie.org/5683646
[17:05:34] <foofoobar> How would you do this in a nosql way?
[17:05:59] <bean> foofoobar: you'd probably just have a user, that contains details about its category and job
[17:06:33] <foofoobar> bean, a job is related to a category, not a user
[17:07:15] <foofoobar> also a user can offer a lot of jobs
[17:07:24] <NodeX> foofoobar : it's hard to answer without knowing your access patterns
[17:07:28] <foofoobar> and how would you show a list of all jobs? you would have to iterate over all users
[17:07:40] <foofoobar> NodeX, What do you mean with "access patterns"?
[17:07:48] <NodeX> err how you access your data lol
[17:07:55] <kali> foofoobar: what kind of requests you'll need to make
[17:08:01] <NodeX> or that ^
[17:08:04] <foofoobar> I'm just trying to think of the architecture
[17:08:07] <kali> the *only* thing that matters
[17:08:12] <foofoobar> It's just theoretical
[17:08:21] <NodeX> do users get queried the most, do jobs get queried most etc etc
[17:08:29] <foofoobar> kali, I want to get a list of jobs (also sortable by category)
[17:08:33] <NodeX> reads or write heavy
[17:08:36] <foofoobar> reads
[17:09:03] <foofoobar> I don't want to get a user list or stuff like this, the important point s getting a list of all jobs
[17:09:12] <foofoobar> *is
[17:09:12] <NodeX> I would store the category with the job for a nice fast query
[17:10:02] <kali> then that's your answer
[17:10:07] <kali> see ? nosql is easy :)
[17:10:10] <bean> yeah, a nice collection of jobs that contains what category they're in would work well
[17:10:12] <NodeX> LOL
[17:10:21] <NodeX> I thought that was implicit tbhj
[17:10:24] <NodeX> tbh *
[17:12:08] <foofoobar> so when I want to get all jobs for a specified category I have to iterate over all jobs?
[17:12:42] <NodeX> itterate ...
[17:12:45] <kali> you can have an index on the category
[17:12:51] <NodeX> ^^ index
[17:13:14] <foofoobar> okay
[17:13:16] <kali> or if that's not good enough, you can store the jobs in the category documents (i don't think you'll need to do that)
[17:16:03] <foofoobar> so in relational sql I would have a category table with an id and a name. So when I now save the category in the job document, what would I save for the category? the name or an id ?
[17:16:56] <NodeX> tbh I wouldn't even have a relational table for category
[17:18:44] <foofoobar> So just save the name.. all right. And in case there is more than one field for the category? e.g. an icon-name? also put it in there?
[17:19:08] <kali> foofoobar: if you need to display it when you show the list of jobs, yeah
[17:19:13] <NodeX> you can embed anything should you need
[17:19:24] <foofoobar> So redundant data is not a problem?
[17:19:39] <kali> foofoobar: it's an issue, not a problem :)
[17:19:41] <NodeX> redundant data often means fast performance
[17:19:46] <NodeX> faster *
[17:19:53] <NodeX> as it makes for less queries
[17:20:41] <kali> foofoobar: the idea in mongodb design is to optimize your documents for reads as they are 99% of your typical web app requests
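
Putting the advice above together, a hedged sketch of the read-optimized layout being discussed. Field names and values are invented for illustration; ensureIndex is the 2.2-era name for index creation:

    db.jobs.insert({
        title:    'MongoDB admin wanted',
        user_id:  ObjectId(),                               // the offering user
        category: { name: 'databases', icon: 'db.png' }     // denormalized copy of the category
    })
    db.jobs.ensureIndex({ 'category.name': 1 })
    db.jobs.find({ 'category.name': 'databases' })          // indexed lookup, no iteration
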
[17:22:44] <foofoobar> Sounds good. I think I will give it a try and look how it fits
[17:23:17] <NodeX> not everything works with nosql, but I have yet to be stopped by something that doesn't work
[17:45:53] <tworkin> what is the point of the comment at the bottom of this code snippet? What class is vals/Vals a member of? http://www.mongodb.org/pages/viewpage.action?pageId=19562815
[17:56:46] <tworkin> bson-inl.h:840:13: error: no matching function for call to ‘mongo::BSONElement::Val(unsigned int&)’ - why is there no symmetric error for BSONArrayBuilder.append(unsigned int)?
[18:04:37] <Kaim> kali, my cluster is clean now, but still have the pb
[18:05:46] <kali> Kaim: have you looked for the document that your logging app is pushing that triggers the problem ?
[18:06:28] <Kaim> In fact I'm using fluentd and the mongodb plugin to do insert
[18:06:52] <Kaim> I patched the module to show me the insert query
[18:12:19] <Kaim> kali, btw do you have advice about max number of chunks per shard and max database size per shard?
[18:22:35] <nopp> Mon Jan 14 16:18:09 [repl writer worker 3] ERROR: warning: log line attempted (59k) over max size(10k), printing beginning and end ...
[18:22:43] <nopp> hi ... my mongos down
[18:22:45] <nopp> :|
[18:22:52] <nopp> replicaset 2 servers down
[18:23:01] <nopp> Mon Jan 14 16:18:09 Backtrace:
[18:23:01] <nopp> 0xade6e1 0x5582d9 0x3ba62302d0 0x3ba6230265 0x3ba6231d10 0x802e3e 0x6505b6 0x77d3dd 0x7c3659 0x3ba6a0673d 0x3ba62d44bd
[18:36:43] <foofoobar> j #network
[18:36:49] <foofoobar> sorry
[18:46:17] <kali> Kaim: nope, i'm not sharding my data
[18:47:31] <calvinow> Hello all. I'm looking for some insight into my schema design in Mongo. I'm dealing with a dataset where there are a potentially large number of 'nodes', each of which would be stored as a seperate document. I need to store relationships between these nodes, potentially from each node to every other node, although that worst case would not occur often. Currently, I'm storing these relationships as subdocuments in each node, b
[18:56:15] <kali> calvinow: make shorter lines, we lost you at "subdocuments in each node, b"
[19:09:44] <HappyPsychoD> Good day, is it possible to use one result set as the input for a foreach which requires a second query per record?
[19:10:21] <kali> HappyPsychoD: in your application language, yes :)
[19:11:23] <HappyPsychoD> Damn... I was hoping I could do something like this
[19:11:23] <HappyPsychoD> var toQuery = db.user.find({'last_seen': { $gt : 1355198400}}).toArray();
[19:11:23] <HappyPsychoD> toQuery.forEach(function(entry){ printjson( db.session.find({ 'mac': entry['mac'] }).sort({ 'disassoc_time': -1 }).limit(1)["ap_mac"] ); });
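
As kali hints, the mongo shell is itself JavaScript, so the snippet above nearly works as written; the ["ap_mac"] index on a cursor would not, though. A hedged cleanup, keeping HappyPsychoD's own field names:

    var toQuery = db.user.find({ last_seen: { $gt: 1355198400 } }).toArray();
    toQuery.forEach(function (entry) {
        var last = db.session.find({ mac: entry.mac })
                             .sort({ disassoc_time: -1 })
                             .limit(1).toArray()[0];      // fetch the document before reading a field
        if (last) printjson(last.ap_mac);
    });
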
[19:27:03] <calvinow> kali: but I'm concerned that may be sub-optimal, and I'd be better off storing these relationships as separate documents (the relationship consists of two integer counters). Thoughts?
[19:28:10] <kali> calvinow: the most performant way is to paginate the relationships
[19:28:30] <kali> calvinow: store them 1000 per document
[19:28:40] <kali> but it is also the most brain fucking one
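
A hedged sketch of the pagination kali describes, bucketing relationships 1000 per document. All names are assumptions; the two integer counters come from calvinow's description:

    // append a relationship to the current non-full page of a node;
    // the upsert creates a fresh page when no page has room left
    db.rel_pages.update(
        { node_id: 1, count: { $lt: 1000 } },
        { $push: { rels: { other: 2, counter_a: 0, counter_b: 0 } },
          $inc:  { count: 1 } },
        true   // upsert
    )
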
[19:30:08] <calvinow> kali: My problem with that is that I use aggregate() to calculate a score based on a subset (or all) of the relationships, which is complicated by not having them stored in the same document
[19:30:53] <kali> really ?
[19:34:03] <calvinow> I mean, complicated from a computational standpoint, not design... I need to get N nodes with the highest 'score', and if I separate the relationships that seems not to be do-able using aggregation...
[19:35:46] <jaimef> so if you are dealing with a lot of slack/sparse data files, and want to compact a db, is it best to just nuke the data directory on a replica, then let it sync up, then promote to primary, and lather, rinse, repeat with the rest of the mongo servers?
[19:35:48] <kali> calvinow: if you have... say 1000 relations per page, you'll make 1000 times fewer random accesses than with a separate relationship collection
[19:36:20] <kali> jaimef: i do resync in that kind of situation
[19:37:10] <jaimef> kali: ok. just getting back into it after over a year, and i remember doing that previously
[19:38:18] <calvinow> kali: Yes. I guess my ultimate question is why not store them all in the same document? In this use case, that seems as though it would be faster to me...
[19:38:47] <calvinow> The score calculations are lightning fast right now, and I want them to stay that way.
[19:39:31] <kali> calvinow: the documents have a 16M limit
[19:40:04] <kali> calvinow: do you denormalize the score ?
[19:40:22] <jaimef> kali: you resync through mongo with empty dir? or you copy it over from other rs?
[19:40:35] <calvinow> kali: Yes, but that can be changed at compile time, and once this project I'm on gets off the ground I'd likely be maintaining my own source tree for the servers, so that's not much of a concern
[19:40:48] <kali> jaimef: i stop the instance, rm -rf in the dir, and start again
[19:41:26] <calvinow> kali: The score has to be calculated - it's based on a currently selected subset of nodes. Storing it would be n! complexity...
[19:41:41] <kali> calvinow: ok
[19:41:55] <jaimef> kali: and that will reduce space usage right assuming sparse data. ok thanks.
[19:42:06] <kali> jaimef: it should
[19:42:53] <jaimef> kali: ok, thought I remembered it working that way. thanks
[19:43:18] <calvinow> kali: I mean, I'm just playing devil's advocate here... I think for the purposes of that calculation, everything being in the same doc is a big win.
[19:43:36] <calvinow> I'm just not sure if that benefit is outweighed by the overhead of potentially large documents
[19:44:15] <kali> calvinow: well... i don't know the size of your dataset. if you're sure you're not likely to hit the 16M limit, then why not
[19:44:42] <kali> calvinow: also... updating the relations could be a pain if the documents are huge
[19:45:38] <calvinow> kali: I'm talking terabytes and terabytes here... massive amounts of data. The documents _could_ be larger than 16M, but it would be rare... I'd probably change that to 64M or suchlike just to be safe.
[19:46:20] <calvinow> I can fold all the updates for each document into a single update() query without a problem... so I think that works out okay
[19:46:21] <kali> calvinow: i'd rather plan for 10k or 100k pages, honestly
[19:46:58] <kali> calvinow: i assume most docs will fit in the first page, and it avoids having to patch
[19:50:29] <calvinow> kali: hmm, ok. Thanks a bunch for the insight.
[20:08:32] <nyxtom> Does anyone know how to solve this problem I'm having with large indexes and having nscanned be much larger than n? https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/P4Tb9bcx5OU
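
For context on nyxtom's question, explain() is the usual way to compare those two numbers; the collection and query here are placeholders:

    var plan = db.mycoll.find({ status: 'A' }).sort({ ts: -1 }).explain();
    // "n" is the number of documents returned, "nscanned" the number of
    // index entries examined; a large gap means the index is a poor fit
    print(plan.n + ' returned, ' + plan.nscanned + ' scanned');
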
[21:06:42] <strnadj> ohh, you want the object_id of the objects which were changed by your update command?
[21:07:28] <macpablo> i know the id, but i want to know if it was an insert or an update
[21:08:40] <macpablo> http://php.net/manual/en/mongocollection.insert.php i want to get the upserted value of that array
[21:08:50] <macpablo> but i don't know where that is
[21:09:52] <strnadj> :( now, I am not sure if i can help you :/
[22:18:05] <jaimef> is ext4 preferred over
[22:18:08] <jaimef> say xfs?
[22:26:22] <forrest__> http://docs.mongodb.org/manual/administration/production-notes/#mongodb-on-linux
[22:36:46] <jaimef> forrest__: thanks
[22:50:58] <dudebro> hey guys, cross-posting from #mongoengine since it isn't really specific to that driver - running into a problem with linking documents.  say we have three types of objects that are related, for example customers, jobs and employees.  jobs are associated with a customer and an employee so in this case it seems fine that a job contains a reference to each of those entities.  for customers and employees however, the jobs associated list could/should grow
[22:51:23] <dudebro> it seems like the latter is the "mongo way", but there doesn't seem to be any reason to not have a join table as well apart from the atomicity of writes
[22:56:23] <jesta> dudebro: so whats the problem exactly? :)
[22:57:23] <dudebro> jesta: just trying to figure out if there's a common model for this type of interaction. i suppose even with references stored in the user/employee objects there's a lot i can store before running into document size limits
[22:58:04] <jesta> Lemme write out real quick how I'd envision it
[23:01:38] <jesta> dudebro: I would just make it so that the Jobs reference the Company and Employees involved, and don't store any refs on Employees and Customers - https://gist.github.com/4534362
[23:02:34] <jesta> Unless the Employees are employees of the customers, then you might want to ref them together
[23:02:40] <jesta> but you don't need to ref everything to everything
[23:02:47] <dudebro> jesta: so in that scenario, say i want to look up all jobs for a customer i just query the jobs collection itself to match on the customer ref?
[23:03:30] <jesta> db.jobs.find({_customer: DBRef('customers', ID)});
[23:03:38] <jesta> you can use references to query
[23:04:12] <jesta> that would find all jobs where the customer reference is set to that customer
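
A hedged sketch of the shape jesta describes (the gist's exact contents aren't reproduced in the log, so all names here are assumptions):

    var customerId = ObjectId();
    db.jobs.insert({
        _customer:  new DBRef('customers', customerId),        // reference, not an embedded copy
        _employees: [ new DBRef('employees', ObjectId()) ],
        title:      'install widgets'
    })
    // all jobs for one customer, queried from the jobs side only
    db.jobs.find({ _customer: new DBRef('customers', customerId) })
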
[23:07:33] <dudebro> thanks jesta, i can see that would be better since writes would only have to happen in one place, i guess i'll just wait until i'm super huge to worry about the relative performance. (this seems better than having a single user or employee grow huge though, so it's probably better all around)
[23:09:41] <jesta> dudebro: np np, and honestly the jobs shouldn't get that huge just due to the fact that they always only reference a company and maybe a few employees. Then ya just create a new job, which gets new references :) Much better than storing it on the employee or the customer, cause yeah, they'd grow to be huge.
[23:22:16] <solars> hey, is it possible to replace the 2.2 binaries with the dev version 2.3 currently for testing?
[23:28:54] <solars> which is v2.2.2 vs v2.3.2