#mongodb logs for Tuesday the 10th of July, 2012

[00:00:32] <nemothekid> We are upgrading our mongo server to support sharding. We currently have two shards. Our initial database has about 200 gigs of data, one collection has almost 2000 chunks. Should we just let mongo do its thing or is there a way we can make this go faster (presplitting?)
[00:00:50] <jstout24> what's the best way to increment a float value? (i.e. revenue)
[00:06:20] <jstout24> hmm, i just realized you can use $inc with floats
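[editor's note] As jstout24 found, $inc accepts floating-point values as well as integers. A minimal mongo-shell sketch (collection and field names are hypothetical):
    // atomically add 19.99 to the revenue field of one document
    db.sales.update({_id: "2012-07-10"}, {$inc: {revenue: 19.99}})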
[00:13:49] <DigitalKiwi> Using salat with a model like this http://dpaste.com/768801/ does it store a reference to, for instance, the adder field, or does it store a copy of whatever was in that model when it's created?
[00:17:20] <ptmang> hi guys
[00:18:01] <ptmang> what do you recommend/use for the max proc ulimit for the mongo user
[00:18:58] <ptmang> i know that we want to set the max open files ulimit high, but what about the max proc limit?
[00:20:52] <nemothekid> We are upgrading our mongo server to support sharding. We currently have two shards. Our initial database has about 200 gigs of data, one collection has almost 2000 chunks. Should we just let mongo do its thing or is there a way we can make this go faster (presplitting?)
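[editor's note] Pre-splitting can speed up an initial load by creating chunk boundaries up front, so the balancer moves empty chunks instead of splitting under load. A sketch using the split and moveChunk commands, run through mongos (namespace, shard key, and shard name are illustrative):
    // create a chunk boundary at userId 1000
    db.adminCommand({split: "mydb.events", middle: {userId: 1000}})
    // move the chunk containing userId 1000 to another shard
    db.adminCommand({moveChunk: "mydb.events", find: {userId: 1000}, to: "shard0001"})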
[01:01:42] <glaw> In a sharded collection, is it possible to find what shard key ranges belong to each shard?
[03:01:36] <skot> yes, it is stored in the chunks collection of the config db.
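[editor's note] A sketch of inspecting the chunk ranges skot mentions (namespace is hypothetical):
    use config
    // each chunk document records its min/max shard-key range and owning shard
    db.chunks.find({ns: "mydb.events"}, {min: 1, max: 1, shard: 1})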
[03:02:06] <skot> ptmang: yes, you should set this high too.
[03:02:19] <skot> It will affect the number of threads which can be created.
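[editor's note] A persistent way to raise both limits is /etc/security/limits.conf; a sketch assuming the process runs as user "mongod" (values illustrative, not official recommendations):
    # /etc/security/limits.conf
    mongod  soft  nofile  64000
    mongod  hard  nofile  64000
    mongod  soft  nproc   32000
    mongod  hard  nproc   32000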
[04:23:03] <domo> is the global write lock still an issue in mongodb? I'm asking because I'm currently using mongodb on a single server
[04:23:48] <duncan_donuts> is there a file-base config parameter to specify the priority of a node in a replset?
[04:34:58] <skot> no, priorities are set in the configuration document (rs.config()) only
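[editor's note] A minimal sketch of changing a member's priority through the configuration document, run against the primary (member index and priority value are illustrative):
    cfg = rs.conf()
    cfg.members[1].priority = 2  // higher priority = preferred primary
    rs.reconfig(cfg)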
[04:35:43] <skot> domo: it is debatable how much the reader/writer lock is an issue in practice, but in 2.2.0 there will be database-level locks.
[04:39:21] <domo> skot how does that affect performance for a single server
[04:39:36] <ron> database level locks? I thought you were going for collection level locks.
[04:41:42] <skot> 2.2.0 will be db level, future will be lower
[04:41:56] <skot> domo: depends on your throughput.
[04:42:18] <skot> generally the lock is not the limiting factor for most people, the disk/memory-size is.
[04:42:40] <duncan_donuts> skot: thanks
[04:42:58] <skot> There is also logic which checks to see if a document that you need to update is in memory, and if not it gets it into memory outside of the lock
[04:43:12] <domo> skot: what kind of throughput should you start worrying at?
[04:43:23] <domo> right now im maybe doing 300-500 writes per second
[04:43:42] <skot> probably somewhere north of 50-60K write operations per second
[04:43:45] <domo> lol
[04:44:06] <domo> write operations in this context == incrementing a value right?
[04:44:13] <skot> any document update
[04:44:16] <domo> ok
[04:44:32] <skot> could be a single field or a whole document, or insert or delete
[04:44:51] <domo> and once you start hitting the barrier, you can solve it by introducing sharding, correct?
[04:44:56] <skot> and the number of indexes is a factor as well, as are many other things.
[04:45:05] <skot> yes, you can.
[04:45:10] <domo> i mean, i know i should be doing replication regardless
[04:45:14] <skot> yes
[04:45:28] <skot> It also depends on the speed of your cpu/disk
[04:45:31] <domo> yeah
[04:46:03] <skot> so there is no one answer but in general you should be able to push many tens of thousands of writes on a single instance
[04:46:14] <domo> ok great info
[04:46:31] <skot> as I said, the lock is generally not the issue.
[04:46:40] <domo> then why is it complained about so much
[04:46:51] <domo> everywhere i read, every blog/article says "mongo's write lock sucks"
[04:46:55] <skot> because it seems like a big deal, but all databases have a single lock
[04:47:10] <skot> lack of understanding, knowledge, or thought.
[04:47:42] <skot> don't quote me on that.
[04:47:47] <domo> i wont
[04:47:54] <domo> because the channel already did
[04:47:58] <domo> ;)
[04:48:14] <domo> haha but thanks for the info
[04:50:23] <DigitalKiwi> found the answer to my question....
[04:50:37] <DigitalKiwi> had to restructure layout :(
[04:51:35] <DigitalKiwi> but the same query results are faster than it was before so i guess it was for the better...
[06:01:08] <jstout24> I'm watching http://www.10gen.com/presentations/mongonyc-2012/real-time-data-analytics and there's a slide where the _id is a hash
[06:01:32] <jstout24> _id: { p: *some date time*, type: 'SomeType' }
[06:02:42] <jstout24> what's the benefit in doing that vs just having the id be p concatenated with type
[06:03:13] <jstout24> or an ObjectId and then a datetime & type field within the document
[06:03:25] <jstout24> and is there anywhere i can read up on this type of key?
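[editor's note] The compound _id pattern from the talk keeps the components typed and individually queryable, and reuses the mandatory _id index instead of needing an extra unique index on a concatenated string. A sketch (collection, date, and type are hypothetical):
    db.stats.insert({_id: {p: ISODate("2012-07-10T00:00:00Z"), type: "PageView"}, count: 0})
    // later increments address the document by its compound key
    db.stats.update({_id: {p: ISODate("2012-07-10T00:00:00Z"), type: "PageView"}}, {$inc: {count: 1}})
One caveat: equality matches on a subdocument _id are order-sensitive, so the fields must always be written in the same order.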
[06:20:26] <DigitalKiwi> hmm, but now I don't know how (or if I can) to orderby on the field I want since it's part of a linked document...
[06:23:36] <DigitalKiwi> http://dpaste.com/768899/ structure, I want to be able to order a list of bookmarks by the userId._id field in the linked user object
[06:34:37] <pshr> Hi, I have a doubt, I am using mongodb java driver to connect to database from my java application with the following properties mongo.safe.connectionsPerHost=1 mongo.safe.threadsAllowedToBlockForConnectionMultiplier=20 mongo.safe.connectTimeout=10000 mongo.safe.maxWaitTime=15000 mongo.safe.autoConnectRetry=true mongo.safe.socketKeepAlive=true mongo.safe.socketTimeout=60000 mongo.safe.slaveOk=false mongo.safe.writeNumber=0 mongo.safe.w
[06:34:51] <pshr> mongo.safe.writeFSync=false
[06:35:22] <pshr> I keep getting intermittent exceptions saying that "Connection wait timeout after 15000 ms; nested exception is com.mongodb.DBPortPool$ConnectionWaitTimeOut: Connection wait timeout after 15000 ms"
[06:36:03] <pshr> Can anyone please shed some light on the issue? Also, is keeping "connectionsPerHost=1" a good idea or a bad idea, and is it what's causing this error?
[06:38:16] <pshr> here's the pasted properties http://pastie.org/4230106
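[editor's note] With connectionsPerHost=1 and threadsAllowedToBlockForConnectionMultiplier=20, the pool holds a single socket and up to 20 threads may queue for it, each waiting at most maxWaitTime (15000 ms), which matches the exception pshr is seeing. The usual fix is a larger pool; a sketch against the same properties file (values illustrative):
    mongo.safe.connectionsPerHost=20
    mongo.safe.threadsAllowedToBlockForConnectionMultiplier=5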
[06:44:15] <DigitalKiwi> also, what's the difference between 1 and -1 when defining the index? one is for asc and the other desc, but they seem to return just as fast in both directions even with an index on just 1?
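[editor's note] For a single-field index the direction (1 vs -1) rarely matters, because the index can be traversed in either direction; direction matters for compound indexes whose sort mixes orders. A sketch (names hypothetical):
    db.bookmarks.ensureIndex({created: 1})
    db.bookmarks.find().sort({created: -1})  // still uses the index, walked backwards
    // direction matters when a compound sort mixes orders:
    db.bookmarks.ensureIndex({userId: 1, created: -1})
    db.bookmarks.find().sort({userId: 1, created: -1})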
[07:24:07] <tiripamwe> hi guys do all databases grow to match the largest database in a mongod instance?
[07:24:46] <horseT> Where can I find the accepted character list for a dbname?
[07:25:26] <NodeX> a-zA-Z0-9_
[07:25:49] <NodeX> why would you ever want to use anything else lol
[07:35:42] <[AD]Turbo> hola
[07:38:27] <rydgel> hey guys got a question. Is there a way to find what the query is if all I got is the Cursor ID?
[10:27:18] <horseT> NodeX: Thanks
[10:56:28] <Miljar> has anyone ever had the problem where you query for a document with an array of subdocuments, and the result shows that the array of subdocuments has only 1 item in it, while it should have more items?
[10:57:13] <Derick> you should post your code somewhere :)
[10:57:27] <Miljar> http://pastebin.com/ErsgNuaV :)
[10:58:28] <NodeX> that's just the results
[10:58:32] <NodeX> please paste the query
[10:58:38] <Derick> and the insert...
[10:58:50] <NodeX> +1
[10:59:13] <Miljar> hang on
[11:28:38] <Miljar> odd... when I was cleaning up my data, inserting it and then doing a find on it, everything worked as expected
[12:30:55] <xy77> What's the difference between a database and a collection?
[12:34:41] <mids> xy77: are you familiar with relational databases?
[12:34:50] <xy77> yes
[12:34:55] <mids> a collection is like a table
[12:34:59] <mids> a database is like a database
[12:35:39] <deoxxa> lol
[12:36:43] <xy77> Okay, where do I find the databases in the mongodb console? In db I find only collections.
[12:37:02] <mids> show dbs
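[editor's note] The basic navigation commands in the mongo shell, for reference:
    show dbs          // list databases
    use test          // switch to (or lazily create) a database
    show collections  // list collections in the current database
    db.mycoll.find()  // query a collection in the current database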
[12:37:23] <deoxxa> if only there was some way to find this stuff out
[12:37:32] <deoxxa> like if someone wrote it all down in some kind of collection of documents
[12:37:40] <deoxxa> that would be so cool
[12:38:10] <mids> deoxxa: on something like the internet?
[12:38:59] <deoxxa> yes, some kind of interconnected network, yes that would do nicely
[12:39:26] <deoxxa> now we just need to figure out a way to locate relevant information based on a small amount of known information
[12:39:34] <deoxxa> like a keyword of sorts... some method of "searching"
[12:39:41] <mids> oh wait!
[12:39:45] <xy77> I somehow figured everything worked object style, db.something(); use test was just too easy.
[12:39:45] <deoxxa> !!!
[12:39:46] <mids> lets call this thing a 'search engine'
[12:39:55] <deoxxa> you're a genius, mids.
[12:40:11] <deoxxa> a genuine, bona fide, first class genius
[12:40:25] <mids> yeah, there must be a business plan there
[12:40:40] <deoxxa> alas, it is nigh impossible
[12:41:08] <mids> because there is no database that is web scale to store everything in?
[12:41:23] <deoxxa> rats!
[12:41:25] <deoxxa> foiled again
[12:42:19] <mids> xy77: just ignore us while deoxxa and I play silly elitists
[12:42:57] <deoxxa> known by those pressed for time as simply "silletists"
[12:43:54] <xy77> You bet. How about one of you two make sure a link to the tutorial gets added at the end of the install guide?
[12:44:38] <xy77> or add use <db> to the DOCS/Databases page?
[12:45:54] <deoxxa> sure lemme get right on that
[12:50:55] <mids> xy77: added as comment to http://www.mongodb.org/display/DOCS/Commands
[13:31:34] <pnh> Hi All, I'm trying to make a query which returns the field names in a collection. I.e., I have a collection called users, and it has a variable number of fields like gender, name, age etc. Now I want to know what fields are present in this entire collection..how can I make this type of query?
[13:32:10] <pnh> I want to query field names, not their values.
[13:33:22] <skot1> there is no such thing.
[13:33:40] <skot1> each document can have different fields and the only way to get a list of them would be to inspect each document
[13:34:21] <pnh> oh... i see... I can't query for a field name using a regex?
[13:34:51] <pnh> like get me all documents with fields having keywords ag*
[13:35:06] <mids> maybe you want to have a different schema to do what you are trying to do
[13:35:07] <skot1> sure, but the docs which come back will contain the fields and values
[13:35:49] <mids> if you have {attr:{name:'agora',value:10}}, you can put an index on that and search on the name of the attribute
[13:36:24] <pnh> oh.. okay fine... I also heard that we can get the key name from map reduce... http://stackoverflow.com/questions/2298870/mongodb-get-names-of-all-keys-in-collection
[13:36:50] <mids> pnh: what is your ultimate goal?
[13:37:26] <pnh> I just want to know all the users with age attribute , gender attribute etc..
[13:37:33] <pnh> value doesn't matter..
[13:38:26] <mids> per document you could store a set with the attribute names that you store
[13:39:12] <mids> then you can do an $in / $all query on that set to get matching users
[13:39:21] <skot1> I think you want to use $exists
[13:39:36] <skot1> $exists, to indicate you want docs with some field
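[editor's note] A sketch of $exists queries for pnh's case (collection and field names hypothetical):
    db.users.find({age: {$exists: true}})      // docs that have an age field
    db.users.find({gender: {$exists: false}})  // docs missing a gender field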
[13:40:00] <lotia> can you have a shard with a single replica set/server? Or will it simply not work? I'm considering doing it that way so I can add more servers/replica sets when the need arises.
[13:40:31] <pnh> mids, skot1 : oh... okay fine... will explore more about it... thanks :)
[13:40:51] <mids> skot1: ah yeah $exists, iirc that had some performance issues but might be resolved
[13:41:11] <skot1> lotia, yes you can have shards which are not replica sets
[13:41:22] <pnh> but I'm not aware of which attributes exist... it's totally variable...
[13:41:46] <pnh> a document can have any field, which I don't know...
[13:41:55] <pnh> then how can I use $exists?
[13:41:56] <skot1> mids: conceptually indexes are not good at finding when things aren't there, but performance is better now
[13:42:10] <mids> skot1: kewl
[13:42:22] <skot1> pnh: if you can't phrase the query, I can't offer much help.
[13:42:38] <skot1> How will you get field names in the query if you don't know the field names?
[13:42:57] <skot1> I assume what you are saying is that your program won't know, but the user will, correct?
[13:43:33] <pnh> skot1: exactly.. the user sends some data which I store in my mongo.. I never know what attributes are present..
[13:44:00] <pnh> that's what I'm trying to do now.. I want to know which attributes exist in a given collection...
[13:44:29] <pnh> there is no predefined set of fields.. the user can send any attribute
[13:44:49] <mediocretes> pnh: you'd have to walk the collection
[13:45:15] <pnh> but it's too huge... 1 million documents in that collection..
[13:45:18] <skot1> I would suggest keeping a document somewhere which contains that list of fields; as you read and write docs, make sure the list is accurate, and you can use the list to show options for users to search on.
[13:45:41] <skot1> At first the list will be empty, but as the system gets used it will be full
[13:46:05] <skot1> you can start with a few docs from the collection to seed the "cache" of schema fields
[13:46:10] <skot1> you need to do this manually.
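[editor's note] The map-reduce approach from the Stack Overflow link above can seed that cache in one pass; it scans the whole collection, so it is a one-off job rather than something to run per request. A sketch (collection and output names hypothetical):
    db.runCommand({
        mapreduce: "users",
        map: function () { for (var k in this) { emit(k, null); } },
        reduce: function (key, values) { return null; },
        out: "users_keys"
    })
    db.users_keys.distinct("_id")  // the distinct field names seen across all documents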
[13:46:13] <lotia> skot1: but can I have a sharded setup with a single replica set so I can add more later?
[13:46:28] <skot1> lotia: yes
[13:46:57] <pnh> skot1: won't it create any performance issues as the collection grows.. my collection size may grow up to 100 million documents...
[13:47:36] <lotia> skot1: thanks
[13:48:25] <pnh> 1st I'll try what you said, then get back to you... thanks a lot :)
[13:49:11] <mids> heh, does the tshirt help?
[13:49:15] <mids> maybe I should get one too
[13:49:28] <pnh> nah.. but it gives some inspiration...
[13:49:32] <pnh> :)
[14:08:44] <jiffe98> I have a collection which is going to be insert and read only, no updates; each individual record won't be that large, but there will be a lot of them. Is there a good way of compressing this collection?
[14:22:46] <neil__g> i have a secondary with the rsSync op as active:false - it's getting more and more behind - what can i do to kickstart it?
[14:39:06] <neil__g> anyone :(
[15:16:46] <Bartzy> Why are slaves for reads only good for scaling? I mean, if the data fits in RAM, it will be fast even on a single server that does both writes and reads. If the data doesn't fit in RAM, it won't fit on slaves either, and performance will be bad until you shard, so less data needs to fit in each shard's RAM. So what are read slaves good for?
[15:20:29] <NodeX> err
[15:20:50] <NodeX> it's not just CPU / RAM that are bottlenecks
[15:30:40] <jiffe98> Is something like this possible? http://nsab.us/public/mongo
[15:30:57] <jiffe98> On the top I have sharded replica sets where c2 is sharded, c1 is not
[15:31:15] <jiffe98> On the bottom I have a couple slaves which pull from the master's c1 collection
[15:58:36] <jiffe98> looks like this should be possible; the only thing I see as being a problem is being able to slave off a replica set?
[15:59:12] <jiffe98> maybe setup mongos locally and replicate off that?
[17:05:30] <lotia> how do i find out what shard member an unsharded collection lives on?
[17:19:54] <skot1> It lives on the primary for that database.
[17:20:08] <skot1> All unsharded collections in a database live on one shard.
[17:20:44] <skot1> If you run sh.status() it will list all databases and their primary/home shard.
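[editor's note] A sketch of both approaches (database name hypothetical):
    sh.status()  // lists each database with its primary/home shard
    // or read the config database directly:
    db.getSiblingDB("config").databases.find({_id: "mydb"})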
[17:33:58] <mh512> Hi, it seems to me that with 90% read, 10% read-mod-write requests, mongo perf drops dramatically once the data can't fit in RAM.
[17:34:27] <mh512> e.g. with a 1kb value size, I am only getting 3.5k ops/sec via the ycsb benchmark tool.
[17:35:34] <jY> bad indexes?
[17:35:40] <jY> or missing indexes?
[17:35:50] <mh512> using _id
[17:36:20] <mh512> on ssd
[17:37:08] <mh512> this is single machine, single instance though
[17:37:14] <mh512> i have not tried sharding
[19:58:35] <droud> Hi guys...I'm having an issue with an index performing very badly and was wondering if I'm doing something wrong.
[19:59:17] <droud> The index has an array of NumberLongs, a float, and objectids in it.
[19:59:28] <droud> When I query against the NumberLong and the float, performance is great.
[19:59:43] <droud> But when I add the objectid as a sort parameter, it's abysmal.
[19:59:55] <droud> Difference between 50ms and 30m.
[20:00:22] <droud> ~17M objects, it's a sparse index, and the explain() for both the sorted and unsorted query is the same.
[20:00:37] <droud> (with the exception of the orderby of course)
[20:01:07] <droud> Average nobject is about 3000 which seems minimal.
[20:01:56] <droud> I'm just confused about why the sort on objectid, which is in the index, is so much slower? The fields are ordered in the query and the index, with objectid last.
[20:02:14] <mids> ascending or descending sort?
[20:02:29] <droud> All ascending, although I experimented with descending on the ID because I want most recent.
[20:02:58] <droud> No difference in performance either way, although building the index with descending objectids tends to create a larger index by nearly 20%.
[20:03:08] <kali> droud: you're aware indexes can only work if you're requesting the "left" of them?
[20:03:27] <droud> kali: Yes.
[20:04:02] <droud> Index: "group":1, "ranking":1, "_id":1
[20:04:31] <kali> droud: and querying group and ranking, sorting on _id ?
[20:04:45] <droud> Query: {"group":123, "ranking": {$gt:123}}, sort({"_id":-1})
[20:05:05] <kali> ha ! the sort can't use the index in that case
[20:05:07] <droud> I've also tried index with "_id":-1.
[20:05:19] <droud> Ok, what can I do to fix that?
[20:05:41] <kali> it will work if you do group: 123, ranking : 123 and sort _id: -1
[20:06:09] <droud> Hrmm...I do need at least an exists on the ranking.
[20:06:16] <kali> depending on the cardinalities, an index on group and _id (in that order) may help
[20:06:17] <droud> I can do a range, too?
[20:06:36] <duncan_donuts> I have a stale replica set config and want to start over but in what I've read/tried it doesn't look like its possible.. is there a way to ditch an existing rs config and start over?
[20:07:06] <droud> Well, it would help but some of these groups have half a million objects and the rankings are sparse.
[20:07:11] <kali> duncan_donuts: I usually rm -rf * in the dbpath
[20:07:42] <duncan_donuts> kali: I don't want to drop my entire db, just the replica set config
[20:08:11] <kali> duncan_donuts: try dropping "local*", then (or move it away)
[20:08:46] <duncan_donuts> kali: ah ok.. so that's where the op log is?
[20:08:57] <kali> droud: not sure what you can do then :/
[20:09:10] <kali> duncan_donuts: oplog, replicaset metadata
[20:09:29] <duncan_donuts> kali: thx!
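[editor's note] A sketch of the reset kali describes, assuming the node can be taken offline (back up the dbpath first):
    // 1. restart mongod without the --replSet option, then in the shell:
    use local
    db.dropDatabase()  // removes the oplog and replica set metadata
    // 2. restart mongod with --replSet and re-run rs.initiate() / rs.add() as needed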
[20:12:20] <droud> Hrmmm...I guess I'll take care of it in the application layer, and combine the group and ranking into one field. :-/
[20:12:24] <droud> Thanks kali.
[20:13:09] <kali> droud: i don't think combining the fields will work
[20:13:33] <droud> kali: I just need exists on the ranking field.
[20:13:44] <kali> droud: indexes are efficient when you can sort everything in a linear manner and query on intervals of this line
[20:13:54] <kali> ha !
[20:13:55] <droud> If I could create an index with $exists I'd be set, I believe.
[20:14:14] <droud> I thought that the sparse index would help with that, honestly.
[20:14:15] <kali> or a has_ranking boolean, then
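[editor's note] A sketch of kali's has_ranking suggestion: with equality matches on the leading index fields, the trailing _id portion of the index can satisfy the sort (collection and values hypothetical):
    db.items.ensureIndex({group: 1, has_ranking: 1, _id: -1})
    db.items.find({group: 123, has_ranking: true}).sort({_id: -1})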
[20:17:38] <becksebenius> hello, I'm getting a TypeInitializationException error when trying to use the mongodb drivers in c#. Does anyone have a moment to help me troubleshoot?
[20:18:15] <droud> kali: I noticed that index building shows an empty query, is there a way to put the exists clause into that query to limit the index to only contain those documents?
[20:18:36] <droud> (and would that persist?)
[20:18:59] <kali> droud: nope, indexes only work on values
[20:19:29] <droud> Makes me wish I could filter documents before indexing!
[20:22:38] <tystr> having some trouble with 10gen's mms monitoring… I installed the agent and added the host, but there's no data for the host
[20:42:39] <duncan_donuts> is it normal for the initiating node of a replica set to be SECONDARY?
[21:01:45] <linsys> yes
[21:02:18] <duncan_donuts> until the other nodes are caught up?
[21:03:03] <linsys> you said "initiating", so while the other nodes are getting the config and before they can actually vote, there aren't enough votes to nominate a node master
[21:04:37] <duncan_donuts> ok, but it's taking a long time for the remote nodes to initiate… they already seem to have the config.. so why can't they vote? Is getting all the data part of initialisation also?
[21:07:21] <linsys> Not sure what you are asking... what is the status of the other nodes when you do an rs.status()
[21:07:44] <duncan_donuts> sorry I'm having trouble explaining...
[21:07:57] <duncan_donuts> the status of the remote nodes is "still initializing"
[21:08:20] <duncan_donuts> has been for about 40mins
[21:09:36] <listerine> is it possible to use jquery/underscore on mongo console?
[21:10:44] <duncan_donuts> linsys: and the statestr is "UNKNOWN"
[21:15:00] <linsys> then that is why your main node is still secondary... you might want to tail the logs on the other nodes and see what is going on
[21:15:48] <duncan_donuts> linsys: I am tailing the other logs and seeing this:
[21:15:53] <duncan_donuts> runQuery called admin.$cmd { replSetElect: 1, set: "haystack", who: "10.10.5.91:27017", whoid: 1, cfg….
[21:16:19] <duncan_donuts> that IP is the IP of the node I want to be primary
[21:18:02] <linsys> that's all that is in the log? can this node hit 10.10.5.91:27017? if you do mongo 10.10.5.91 can you connect?
[21:18:06] <linsys> from this other host?
[21:19:00] <duncan_donuts> uh-oh.. I think you just gave me a clue
[21:37:00] <duncan_donuts> linsys: thanks for your help.. dns issue
[21:38:43] <linsys> np
[23:19:10] <Epona> can someone help me with the yum installation on centos?
[23:21:31] <Epona> I get this error
[23:21:32] <Epona> Package mongo-10gen-2.0.6-mongodb_1.x86_64.rpm is not signed
[23:21:44] <Epona> I don't see any hint of it anywhere online
[23:38:57] <tystr> is there a way to specify slaveok in the actual query expression without having to issue rs.slaveOk() ?