#mongodb logs for Wednesday the 18th of June, 2014

[00:01:17] <cofeineSunshine> joannac: ok. :) feels better now
[00:18:18] <cirwin> if I do something like: db.errors.update({status: null}, {$set: {status: "open"}}, {multi: true}), is it possible for that to clobber concurrent updates that also set the status?
[00:19:47] <cheeser> i don't think so because of the database level locking
[00:21:50] <joannac> well
[00:22:07] <joannac> depends how they come in
[00:22:13] <joannac> and which one wins
[00:23:09] <cheeser> well, sure. but given this update, that order wouldn't *really* matter.
[00:25:03] <joannac> do multi-updates yield?
[00:25:54] <cheeser> i dunno
[00:27:03] <joannac> okay, so I guess the question is, if another update comes in db.errors.update({_id: whatever, status: null}, {$set: {status: "pending"}})
[00:27:06] <joannac> which would win?
[00:27:16] <joannac> and the answer is "it depends" :)
[00:27:26] <cheeser> pretty much
[00:27:33] <joannac> that's all I wanted to add
[00:33:31] <cirwin> ok
[00:33:34] <cirwin> that's what I wanted to know
[00:33:35] <cirwin> :(
[00:33:55] <cirwin> does $isolated fix that?
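
A minimal sketch of the $isolated operator cirwin asks about (a hypothetical query; collection and field names taken from the discussion). On 2.6-era MongoDB, $isolated keeps other writers from interleaving with a multi-update, but it is not a transaction and is not supported on sharded collections:

    // $isolated prevents concurrent writes from interleaving with this
    // multi-update, so a competing {status: "pending"} update cannot
    // slip in between matching a document and modifying it.
    // Caveats: no effect on sharded collections, and the write lock is
    // held for the whole operation.
    db.errors.update(
        { status: null, $isolated: 1 },
        { $set: { status: "open" } },
        { multi: true }
    )
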
[05:49:37] <alirezataleghani> Hi
[05:49:59] <alirezataleghani> I have been faced with a problem on my production server...
[05:50:22] <alirezataleghani> my disk space was over 95% in use
[05:50:45] <alirezataleghani> so I added some new SATA disks (the main disks were SSD)
[05:51:07] <alirezataleghani> and started tag-aware sharding (tier-2 style)
[05:51:12] <alirezataleghani> http://pastie.org/9300939
[05:51:40] <alirezataleghani> lots of chunks have been moved over to the archive shard right now, but no disk space is released!
[05:52:13] <alirezataleghani> also I dropped unneeded indexes to free some space! (~15GB)
[05:52:19] <alirezataleghani> but nothing happened :-/
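
No one answered alirezataleghani in-channel; for context, a hedged note on why dropping data rarely shrinks the files: with the 2.x mmapv1 storage engine, freed space is reused internally but not returned to the OS unless the data files are rewritten. The collection name below is invented:

    // Defragment a single collection and rebuild its indexes
    // (blocks the database while it runs on 2.6):
    db.runCommand({ compact: "events" })

    // Or rewrite the whole database's files; requires free disk space
    // roughly equal to the current data set size:
    db.repairDatabase()
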
[07:21:10] <thomastuts> hey guys, anyone around? i have a question regarding schema modelling for mongoose (on a conceptual level)
[07:21:55] <kali> thomastuts: it will probably get more traction if you can ask it in pure mongo terms
[07:22:36] <thomastuts> well basically i have two collections: users and packages - they have a many-to-many relationship
[07:22:45] <thomastuts> users can be either a 'user' or an 'admin' role
[07:23:08] <thomastuts> i was thinking of storing users and admins in package.users and package.admins respectively so i know who has which role
[07:23:31] <thomastuts> the problem i'm running into is how do i link this relationship to the users collection
[07:23:53] <thomastuts> should i just use one property, e.g. user.packages that has all packages, regardless of role?
[07:23:57] <kali> how big are the two "many" ?
[07:24:24] <thomastuts> sorry, what do you mean with big? do they have a lot of properties etc?
[07:24:37] <thomastuts> or do users have a ton of packages, and vice versa?
[07:24:39] <kali> more like... what kind of numbers we are talking about
[07:24:43] <kali> yeah
[07:24:45] <thomastuts> ah
[07:24:55] <thomastuts> well, let's say a user usually has 2-4 packages
[07:25:04] <thomastuts> 4 is quite a lot already to be honest
[07:25:13] <thomastuts> and let's say a package will have up to 5-10 users at most
[07:25:37] <kali> ok
[07:26:13] <kali> i would do user: [ id: kali, role: user ], {id: thomas, role:admin }, ... ] }
[07:26:36] <kali> wait :)
[07:26:44] <kali> i need to wake up
[07:26:59] <kali> users: [ { id: kali, role: user }, { id: thomas, role: admin }, ... ]
[07:27:01] <kali> that's better
[07:28:19] <thomastuts> that's a great idea actually
[07:28:46] <thomastuts> then i can populate that object when i retrieve a package and merge the 'user' object together with the role field
[07:29:19] <kali> i'm not sure i understand your last statement :)
[07:29:36] <thomastuts> and you'd store a user's packages like user.packages: [_id, _id, _id, _id], correct?
[07:30:11] <thomastuts> well, when i retrieve a package i would like to populate the users collection with actual user objects instead of just their _id, but that's a mongoose-specific question so it's not really relevant here
[07:30:35] <kali> thomastuts: ho, i get the statement. the role is actually a user system-wide characteristic, not part of the actual user-package relation, right ?
[07:31:33] <thomastuts> no, it's part of the package/user relation
[07:31:43] <kali> thomastuts: that will work, but i would actually do a user.packages: [ { _id: ... }, { _id: ... } ] because it is likely that at some point, you'll need some package property in there too
[07:31:45] <thomastuts> a user can also edit/delete a package, which requires admin privileges
[07:31:54] <kali> ha, ok
[07:32:34] <thomastuts> would you recommend storing every collection reference like that? [ { _id: ... }, { _id: ... } ] vs [_id, _id, _id]
[07:32:41] <thomastuts> for example a package also has brands
[07:33:04] <thomastuts> you'd store them inside an object in the array instead of using the objectid itself, so you can add more metadata to something if it's ever needed, correct?
[07:33:09] <kali> thomastuts: when the arrays stay small (and yours do), yeah, absolutely
[07:33:19] <thomastuts> sorry for these basic questions, still getting the hang of mongodb and modelling etc
[07:34:17] <kali> you're welcome
[07:35:36] <kali> ("no problem" would be more appropriate, but i don't think i would pass turing's test this morning)
[07:40:43] <thomastuts> kali: right, i'll let all this sink in a little and then see what i can come up with :) thanks a bunch for the feedback, really appreciate it!
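
A hedged sketch of the schema kali lands on, with invented names and values: the role lives on the package side of the relation, and user.packages holds objects rather than bare ids so per-package metadata can be added later:

    var pkgId = ObjectId();

    db.packages.insert({
        _id: pkgId,
        name: "example-package",
        // the role is part of the user-package relation, so it is
        // embedded here rather than on the user document
        users: [
            { id: "kali",   role: "user"  },
            { id: "thomas", role: "admin" }
        ]
    });

    db.users.insert({
        _id: "thomas",
        // objects rather than bare ids, per kali's advice at 07:31:43
        packages: [ { _id: pkgId } ]
    });
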
[08:49:00] <jayesh> hello
[08:50:06] <jayesh> can anyone help me with mongodb and elasticsearch integration please
[11:31:29] <bcsantos> hello all
[11:32:18] <bcsantos> i'm trying to get carrierwave uploader to work in padrino with mongoid + gridfs.
[11:32:54] <bcsantos> getting "NoMethodError at /admin/artworks/update/53a0eedcf2c7961066000002 undefined method `bson_dump' for # file: hash.rb location: block in bson_dump"
[11:33:14] <bcsantos> more here >> http://stackoverflow.com/questions/24203977/carrierwave-mongoid-gridfs-padrino-admin-image-upload
[13:18:02] <richthegeek> hey, has anyone got a few minutes to answer some questions about sharding?
[13:19:06] <richthegeek> basically we have each client in a separate database, and currently all those databases are in the same replica set. Eventually, those servers are gonna run out of space and more importantly, out of processing room
[13:19:40] <richthegeek> so what I was hoping to do was have each database automatically sharded (ie, all collections in that DB live in the same physical location(s))
[13:20:37] <richthegeek> more particularly, although this might be mad, is that each database resides on 2/3 of the servers in the N-server cluster - ie, "Bob" is on AB, "Alice" is on CE, etc...
[13:43:06] <rspijker> richthegeek: you can do that sort of stuff, although probably not automatically
[13:43:42] <rspijker> maybe through tagging
[13:45:26] <richthegeek> rspijker: yeah i think we'd prefer to do it manually so we can balance processing allocation against workload-size
[13:50:18] <rspijker> richthegeek: if you want a single DB to always reside on a single shard, then you could just set the primary shard for each DB and just not enable sharding on any of the collections
[13:50:25] <rspijker> see: http://docs.mongodb.org/manual/core/sharded-cluster-shards/#primary-shard
[13:51:01] <richthegeek> rspijker: i see, and then i guess i could just use MMS for the backup (which is what the RS is used for at the moment)
[13:51:39] <rspijker> richthegeek: for backup, sure… but the RS is used for more than backup… it’s used for failover…
[13:51:53] <rspijker> in production you’d really want each shard to be a replica set
[13:52:45] <richthegeek> rspijker: at the moment the db lives on the same server as the processing engine (it's a web analytics platform) so if the db goes down then it doesn't make much difference
[13:54:03] <richthegeek> rspijker: presumably in future we'd keep it that way, as the database storage size seems to be less of a restriction than the CPU usage, so failover is handled further up the chain (transactional event submission through the foremost API)
[13:54:08] <rspijker> well… Then all you need to do is decide for yourself what backups are for. Disaster recovery, rollback in time, … ?
[13:54:34] <rspijker> And then respond appropriately
[13:54:47] <richthegeek> at the moment, disaster recovery (ie, the main server melts) .. ideally rollback in the future which is something more complex than just replication delay
[13:55:50] <richthegeek> might be getting to the point where we hire an actual devop with mongo experience....
[13:58:17] <rspijker> well, those are the types of considerations that should drive the decision on whether you need a RS, whether you need to do backups only through an RS/MMS, or also with FS snapshots, etc.
[14:00:41] <richthegeek> well we've got a few months breathing room before it becomes something we seriously need to consider, and we can always just scale the servers up ... thanks for the help!
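
A hedged sketch of rspijker's suggestion, with invented database and shard names: every database in a sharded cluster has a primary shard that holds its unsharded collections, so a database whose collections are never sharded lives entirely on that one shard:

    // Run via mongos: pin the "client_bob" database to the shard named
    // "shardAB". As long as none of its collections are sharded, all
    // of its data stays there.
    db.adminCommand({ movePrimary: "client_bob", to: "shardAB" })
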
[14:06:18] <Skunkwaffle> I know I can modify settings using ini_set(), but there doesn't seem to be an ini_get() function. Is there any way to check the value of a setting from the mongo shell?
[14:14:40] <rspijker> Skunkwaffle: it looks like you are confusing mongo and php… ?
[14:15:08] <Skunkwaffle> whoops, you're right
[14:16:08] <Skunkwaffle> is there any way to check the value of a setting from a php script?
[14:18:26] <rspijker> I don’t know Skunkwaffle. Question is probably better suited for #php
[14:18:42] <rspijker> but I’m guessing there will also be about a million SO questions about it
[14:22:07] <Skunkwaffle> Already checked SO, didn't see anything. I'll check #php though, thanks
[14:35:20] <jayesh> hello
[14:35:45] <jayesh> i am trying to integrate mongodb and elasticsearch from morning no success
[14:36:02] <jayesh> can any one suggest with good blog or tutorial
[14:36:08] <jayesh> Please
[14:39:20] <Nodex> there is plenty on google, just search for "Mongodb and elastic search"
[14:39:39] <Nodex> else try "It's not rocket science + 42"
[14:39:45] <jayesh> yes i have done that
[14:39:50] <jayesh> was using this http://satishgandham.com/2012/09/a-complete-guide-to-integrating-mongodb-with-elastic-search/#comment-3626
[14:41:20] <Nodex> what doesn't work?
[14:41:35] <jayesh> i installed the plugins
[14:41:45] <jayesh> create the node in elastic search
[14:42:13] <jayesh> but when i do curl -XGET 'http://localhost:9200/mongoindex/_search?q=firstName:John'
[14:42:46] <jayesh> it gives me {"took":9,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
[14:42:58] <jayesh> i cant get the data
[14:43:02] <Nodex> what does q=firstName:* give?
[14:43:35] <jayesh> {"took":9,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":0,"max_score":null,"hits":[]}}
[14:43:50] <Nodex> then there is no data in there, check your elastic search logs to see why
[14:44:05] <jayesh> I'm using elasticsearch-head for that
[14:44:14] <jayesh> it shows 0 docs
[14:44:27] <jayesh> but the data is there in my mongodb
[14:45:48] <Nodex> did you setup your mongodb as a replica set?
[14:45:55] <jayesh> yes
[14:46:12] <jayesh> when i get into mongo shell
[14:46:22] <Nodex> this is probably more a question for #elastic-search
[14:46:35] <jayesh> it shows rs0:PRIMARY
[14:46:44] <jayesh> okay
[14:46:52] <Nodex> it might be #elasticsearch
[14:46:55] <Nodex> can't quite remember
[14:47:08] <jayesh> thank you
[14:47:12] <Nodex> no probs
[14:47:15] <jayesh> no worries will find it out
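
A hedged checklist for jayesh's empty index, assuming the river setup from the linked tutorial: the mongodb river plugin tails the oplog, so the replica set must be healthy and the oplog must actually contain writes for the collection being indexed:

    // Run in the mongo shell on the rs0 primary:
    rs.status()   // members should report PRIMARY / SECONDARY

    // Inspect the most recent oplog entry; if the collection never
    // shows up here, the river has nothing to index:
    db.getSiblingDB("local").oplog.rs.find().sort({ $natural: -1 }).limit(1)
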
[15:54:56] <vonnieda> Hi folks. I have some confusion on when/how clients can read from secondaries. I have a 3 member replset. Client is connecting using all three in the connection string and slaveOk=false. If I remove the master from the connection string it still reads but if I then remove one of the secondaries it fails with "not master and slaveOk=false". I would have thought it would fail with that after removing just the primary. Can someone tell me why?
[15:56:11] <vonnieda> So in other words: m+s+s = okay, m+s = okay, s+s = okay, s = no
[16:04:46] <elfotografo007> Hi vonnieda
[16:05:02] <elfotografo007> A primary may be elected if there is a majority in the replicaset
[16:05:39] <elfotografo007> when you remove a primary in a 3 member replica set, there is majority (2 of 3)
[16:05:44] <elfotografo007> and a new primary is elected
[16:06:08] <elfotografo007> then if you remove the secondary, there is no majority (1 of 3)
[16:06:36] <elfotografo007> and the primary, alone, detects that there is no majority, so it will step down as a secondary to prevent rollback
[16:08:59] <elfotografo007> you can find this information here: http://docs.mongodb.org/manual/core/replica-set-elections/
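
A small illustration of the majority rule elfotografo007 describes (host names and output invented):

    // Summarize each member's state from the shell:
    rs.status().members.map(function (m) { return m.name + ": " + m.stateStr; })

    // With 2 of 3 members up, a majority holds and one member is
    // PRIMARY. With only 1 of 3 up, there is no majority and the
    // survivor reports SECONDARY: it has stepped down to avoid a
    // rollback.
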
[16:09:36] <vonnieda> elfotografo007: Sorry, I was not clear. I am talking about connecting to a replset from a client. I am confused as to when/why a client will read from a secondary.
[16:10:48] <elfotografo007> By default you can't read from a secondary
[16:11:02] <elfotografo007> The only way you can read from a secondary is by specifying SlaveOk
[16:11:21] <vonnieda> elfotografo007: Correct, but that is not the behavior I am seeing. I am able to do reads if I specify two secondaries in the connection string.
[16:11:52] <elfotografo007> yes, because, one of them will become a primary
[16:12:23] <vonnieda> elfotografo007: Simply by connecting from a client? I'm not reconfiguring the replset, just changing the connection string on the client.
[16:12:34] <elfotografo007> that is done automatically
[16:13:41] <sjimi> Is it possible to select documents and filter them so you don't have duplicates for a certain key in those documents? If I did not explain myself well enough please say so. I'll try to elaborate.
[16:14:00] <ernetas> Hey guys.
[16:14:02] <sjimi> I think in the direction of an aggregation or something.
[16:14:12] <ernetas> Is there anything I need to know before deploying a second mongos to our cluster?
[16:14:21] <vonnieda> elfotografo007: I'm sorry, I don't follow. It sounds like you are saying it reconfigures the replication set, but I am not making any changes on the server - just changing the connection string on the client.
[16:14:32] <elfotografo007> ok
[16:14:36] <elfotografo007> let me explain it better
[16:14:50] <vonnieda> Thank you
[16:15:49] <elfotografo007> when connecting a client to a replicaSet, you could include all replica set members in the connection string, or you could just include one member of the replicaSet
[16:16:09] <elfotografo007> then, when the client connects to one member of the set, it asks for the entire list of members
[16:17:43] <vonnieda> elfotografo007: That makes sense, I follow you so far.
[16:17:52] <elfotografo007> ok
[16:18:31] <elfotografo007> with the list of servers, the client determines which server is the primary, and then connects to it
[16:19:02] <elfotografo007> if there is no primary available, it will connect to a secondary
[16:19:21] <elfotografo007> and you can't read from a secondary unless you've specified slaveOk
[16:20:46] <vonnieda> Okay, I understand and agree with all that. I am making a pastebin to show you where I am confused. One second.
[16:20:46] <talbott> hello mongoers
[16:21:11] <talbott> quick q, if i need to move the data files
[16:21:29] <talbott> if there anything else i need to do other than move the files and set the new dbpath
[16:21:32] <talbott> in mongo.conf?
[16:23:04] <talbott> i also made sure of the file perms
[16:23:13] <vonnieda> elfotografo007: This illustrates the problem I am having: http://pastebin.com/DRQAAwu7
[16:23:14] <talbott> but my db wont start with files in the new place
[16:23:55] <vonnieda> elfotografo007: If what you are saying is correct, in the last two cases it should read the config from the secondary and then connect to the primary, right?
[16:25:13] <elfotografo007> vonnieda: right. What client are you using?
[16:26:11] <vonnieda> npm mongodb 1.4.5
[16:29:31] <elfotografo007> vonnieda: what's the output of running db.isMaster() in the mongo console?
[16:33:27] <vonnieda> elfotografo007: Sent to you privately
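
A hedged sketch of vonnieda's setup in the node driver (~1.4), with invented hosts: the hosts in the connection string only seed discovery, and whether reads are allowed without a primary is controlled by the read preference (the modern spelling of slaveOk):

    var MongoClient = require('mongodb').MongoClient;

    // Even if only secondaries are listed, the driver asks them for
    // the full member list and routes reads to the primary by default.
    // "not master and slaveOk=false" appears only when no primary is
    // reachable and secondary reads are not permitted.
    MongoClient.connect(
        'mongodb://host2:27017,host3:27017/test?replicaSet=rs0&readPreference=secondaryPreferred',
        function (err, db) {
            if (err) throw err;
            db.collection('docs').findOne({}, function (err, doc) {
                if (err) throw err;
                console.log(doc);
                db.close();
            });
        }
    );
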
[16:40:35] <sjimi> Hi. I would like some expertise.
[16:40:56] <saml> sjimi, i'm a webscalist
[16:41:23] <sjimi> Let me try to pose my question.
[16:41:36] <saml> okay i let you
[16:42:02] <sjimi> I have documents to match.
[16:42:21] <sjimi> (Need to think hard how to put it so sorry if slowly.)
[16:42:55] <sjimi> In these documents there are documents with a same nested key. e.g. ('data.id')
[16:43:31] <sjimi> I would like to filter out only one occurrence of these (for example the last one).
[16:45:19] <sjimi> So for example when you have 6 documents in total. With three {'data' : { 'id': 1, ... }, ...} like this. And three {'data' : { 'id': 2, ... }, ...} like this.
[16:45:49] <sjimi> I would like to filter out the doubles and get only the last document per id.
[16:46:15] <sjimi> @saml can you follow?
[16:46:28] <sjimi> saml: can you follow? =)
[16:46:35] <saml> do you use flowdock?
[16:46:41] <sjimi> Never heard of that.
[16:46:44] <saml> you got documents to match
[16:46:55] <saml> data.id
[16:46:57] <sjimi> I just need something query like; was thinking of some aggregate.
[16:47:16] <saml> db.docs.find({'data.id':{$exists:1}})
[16:47:56] <saml> db.docs.find().sort({'data.id':-1}).limit(1)
[16:47:58] <Jinsa> maybe by filtering: you order your data by DESC to get the last one and then you apply an array_unique to get only one of them ;)
[16:48:00] <saml> is that what you want?
[16:48:28] <Zelest> SQL deffo wins there
[16:48:44] <Jinsa> SQL can't do full text search
[16:48:53] <Jinsa> MongoDB wins!
[16:48:55] <sjimi> Hmm you are getting closer but I think that won't cut it.
[16:49:23] <Zelest> Jinsa, postgres can do full text search and it's awesome :P
[16:49:32] <Jinsa> haha
[16:49:33] <Jinsa> true
[16:49:40] <Zelest> <3 tsearch2
[16:50:28] <sjimi> saml: if you do what you say you only have one document remaining. Where I want to have two documents.
[16:50:43] <saml> .limit(2) sjimi ?
[16:50:55] <sjimi> The solution of my example should be two documents: {'data' : { 'id': 1, ... }, ...} and {'data' : { 'id': 2, ... }, ...}
[16:51:00] <saml> i don't get your question actually
[16:51:09] <saml> why?
[16:51:15] <sjimi> data.id is not unique btw.
[16:51:26] <saml> copy paste example docs and docs you want to query?
[16:51:32] <sjimi> kk
[16:51:34] <saml> 10 example docs
[16:51:44] <saml> in gist.github.com or something
[16:51:53] <saml> is there mongodb fiddle.com ?
[16:57:51] <gh3go> Hi, I need a little info: does mongodb use any createdAt/updatedAt system fields to keep track of when an entry is created/updated? I noticed it probably has a create date but not an update date, is this correct?
[16:58:11] <gh3go> Or will updating a field update the create date with the timestamp of that very moment?
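
gh3go's question went unanswered; for reference, a hedged note: MongoDB keeps no automatic createdAt/updatedAt fields, but a default ObjectId _id encodes its creation time, and updates never change it (collection name invented):

    // The first 4 bytes of an ObjectId are a unix timestamp:
    var doc = db.things.findOne();
    doc._id.getTimestamp()   // ISODate when the _id was generated
    // Any updatedAt field has to be maintained by the application.
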
[17:09:58] <sjimi> Hello me friends, here I am again.
[17:10:03] <saml> hello
[17:10:04] <saml> asl
[17:10:27] <sjimi> This is my data https://gist.github.com/Sjimi/34563f0a707b5d53a92b.
[17:10:27] <saml> what you want is group by and return representative doc per group right?
[17:10:36] <sjimi> As you can see there are 12 documents.
[17:10:36] <sjimi> With six pairs of documents that have the same 'data.id'.
[17:10:36] <sjimi> From these pairs I would like to receive one of each of them.
[17:10:47] <sjimi> Yep that is it saml.
[17:10:59] <saml> how do you select "representative doc" per 'data.id' ?
[17:11:06] <saml> if there are multiple docs with the same data.id ?
[17:11:24] <sjimi> Pick one random; or pick last one. I don't really care. (Last one preferably.)
[17:11:32] <sjimi> Just asking if it is possible with one Mongo query.
[17:11:36] <sjimi> Hard problem I guess.
[17:11:39] <saml> what is this for?
[17:11:55] <saml> i think you need full scan of collection
[17:12:01] <sjimi> Data Analysis.
[17:12:02] <saml> it'll be costly.. unless you cache
[17:12:08] <saml> so then you can just do a fullscan
[17:12:35] <sjimi> Hmm.
[17:12:50] <sjimi> If anybody else has any ideas.
[17:12:56] <sjimi> Going for food now.
[17:13:01] <saml> var groups = {}; db.docs.find().forEach(function(doc) { if (!groups[doc.data.id]) { groups[doc.data.id] = doc; } });
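
A hedged alternative to the full-scan forEach above, using the 2.6 aggregation framework: group on data.id and keep the last document per group via $$ROOT ($last is order-dependent, hence the sort):

    db.docs.aggregate([
        // ObjectIds sort roughly by creation time, so this makes
        // $last mean "newest per data.id"
        { $sort:  { _id: 1 } },
        { $group: { _id: "$data.id", doc: { $last: "$$ROOT" } } }
    ])
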
[17:51:25] <joshua> Is there a good way to test out a field or two to see how ordinal the data is? I want to come up with some ideas for shard keys but would be cool to verify and compare somehow
[17:53:06] <cheeser> you could do a distinct query on it...
[17:54:36] <joshua> Maybe I should get a dump of the data and play with it in a sandbox. I could try some aggregation and group things and see the counts... kinda simulate chunks that way
[17:55:41] <joshua> The devs are clueless so we have unsharded data building up on different shards. heh
[19:27:19] <Jinsa> hi again ;)
[19:29:39] <Jinsa> just to be sure: the full text search feature (using command("text", {search:"..."})) will be discarded right?
[20:21:52] <djlee> Hi all,is 1 - 1.5 seconds a reasonable time for the mongo php driver to get a connection from the pool manager. I would have thought that it would be 0.2/0.3 seconds at the most?
[20:27:36] <tscanausa> djlee: depends on what you are doing
[20:39:24] <djlee> tscanausa: i think its MongoHQ, i believe the URIs they give you to use aren't optimal (or at least not for PHP). I've found that if i use the URIs from their status page the connection is instant after the first one, so im guessing there's some overhead in resolving the connection every time you connect using their recommended URIs. However i'm emailing them to confirm that im safe to use the other URIs
[20:39:24] <djlee> just in case they are volatile
[20:40:51] <tscanausa> if you are connecting from really far away or have network issues, that could be the case, but not necessarily always the case
[21:44:53] <jimpop> ll
[21:44:57] <jimpop> doh
[23:36:01] <justinsd> Hey guys.
[23:36:26] <justinsd> I am having an issue with mongo where each time I run it with a config script, the config script is ignored and it continues where it left off, using the old replica set data.
[23:36:30] <justinsd> So rs.conf() never changes.
[23:36:36] <justinsd> Even after I shut the machine down and up numerous times.
[23:36:43] <justinsd> Do you know how to get it to use a new configuration?
[23:36:50] <justinsd> I think it is storing the old configuration in the database somehow.
[23:41:22] <joannac> how are you loading the new configuration?
[23:46:21] <Kaiju> http://docs.mongodb.org/manual/reference/configuration-options/
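
For justinsd's symptom, a hedged note: the replica set configuration is persisted in the local database (local.system.replset), not re-read from the config file on startup, which is why restarts keep the old set. One way to replace it from the shell (the member change below is an invented example):

    var cfg = rs.conf();
    cfg.members[0].host = "newhost.example.com:27017";   // example edit
    // force is required when the set has no primary; use with care,
    // as it can roll back unreplicated writes.
    rs.reconfig(cfg, { force: true });
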
[23:51:15] <Kaiju> I'm playing with map reduce for reporting and I've got the hang of the process. I want to do some complex processing of user records in both the mapping phase and the reducing phase. Since the code is shipped into the cluster for processing, I'm not sure how including functions in each stage of the map-reduce process works. Does anyone have a good resource or insight on this topic?
[23:51:37] <Kaiju> I'd like to avoid writing huge ugly if then else scripts
[23:51:54] <Kaiju> implementing in node if that helps
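
A hedged sketch for Kaiju's question: map/reduce functions are serialized and shipped to the server, so helpers defined next to them in your node code do not travel along. Two things that do: helpers defined inside the shipped function itself, and values passed via the scope option (collection and field names invented):

    var map = function () {
        // defined inside the map function, so it serializes with it
        function bucket(age) { return age < cutoff ? "young" : "old"; }
        emit(bucket(this.age), 1);
    };

    var reduce = function (key, values) {
        return Array.sum(values);
    };

    db.users.mapReduce(map, reduce, {
        out:   { inline: 1 },
        scope: { cutoff: 30 }   // exposed as a global inside map/reduce
    });

The same options object works with the node driver's collection.mapReduce(), which helps keep the branching out of one huge if/then/else script.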