#mongodb logs for Monday the 4th of April, 2016

[02:40:20] <kurushiyama> dddh__ You should have a look at the GitHub source.
[02:41:08] <kurushiyama> dddh_ https://www.mongodb.com/blog/post/introducing-mongorover-a-new-experimental-mongodb-driver-for-lua
[03:38:33] <Frenchiie> does doing an update on a document mean it has to find the match every time? is there a way to keep a reference to a document once you find it?
[03:38:46] <Frenchiie> so that an update to it doesn't mean having to go through the collection to find it
[03:41:05] <Frenchiie> not sure if i'm looking for indexing
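There is no persistent handle to a document in MongoDB: an update always re-matches its query filter. What comes closest to what Frenchiie asks for is fetching the document once and updating it by _id afterwards, since the built-in unique index on _id makes those lookups cheap. A minimal pymongo sketch, with hypothetical collection and field names:

    # A minimal sketch (pymongo); collection and field names are hypothetical.
    # Updates always re-match their filter, but a filter on _id is served by
    # the default unique _id index, so it does not scan the collection.
    from pymongo import MongoClient

    coll = MongoClient()["test"]["items"]

    doc = coll.find_one({"name": "widget"})            # one initial lookup
    coll.update_one({"_id": doc["_id"]},               # later updates go by _id
                    {"$set": {"count": 5}})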
[05:55:32] <dddh> kurushiyama: btw I used luamongo
[05:55:54] <dddh> do not like that there are no prebuilt packages for lua mongo clients in current distros ;(
[05:56:04] <dddh> playing with mongorover: "...ua/luarocks/share/lua/5.2/mongorover/MongoCollection.lua:100: No suitable servers found (`serverselectiontryonce` set): [connection error calling ismaster on '10.14.243.131:27017']"
[05:56:14] <dddh> probably something changed ;(
[07:10:03] <dddh> had to remove v6 addresses from docker /etc/hosts ;(
[10:22:24] <Ange7> Hello O/
[10:22:40] <Ange7> the findAndModify() function doesn't have a timeout option? :(
[10:23:24] <Ange7> what is the difference between update and findAndModify ?
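The question goes unanswered in the log. Roughly: update only modifies documents and reports counts, while findAndModify atomically modifies a single document and hands it back (either its old or its new version). A hedged pymongo sketch, with hypothetical names:

    # A minimal sketch (pymongo); collection and field names are hypothetical.
    from pymongo import MongoClient
    from pymongo.collection import ReturnDocument

    coll = MongoClient()["test"]["jobs"]

    # update: you only get counts back
    res = coll.update_one({"state": "new"}, {"$set": {"state": "taken"}})
    print(res.modified_count)

    # findAndModify (find_one_and_update in pymongo): the matched document
    # is modified and returned in one atomic operation
    job = coll.find_one_and_update({"state": "new"},
                                   {"$set": {"state": "taken"}},
                                   return_document=ReturnDocument.AFTER)
    print(job)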
[10:39:35] <pamp> Hi
[10:39:56] <pamp> My mongodb instance fails with no disk space
[10:40:09] <pamp> I'm using the WiredTiger storage engine
[10:40:50] <pamp> how can I fix this problem?
[10:42:10] <Ange7> do you have free disk space?
[10:44:41] <pamp> no :/
[10:46:18] <pamp> is there any option to remove some data without accessing the database?
[10:56:12] <Derick> pamp: not from the database, try removing system log files or something like that first
[10:56:59] <pamp> log files are in another drive
[10:57:02] <pamp> :/
[10:57:49] <kurushiyama> pamp You use lvm?
[11:01:02] <pamp> what do you mean by lvm?
[11:01:18] <pamp> it's linux, but not a vm
[11:01:36] <pamp> it's a physical server
[11:02:21] <kurushiyama> pamp Logical Volume Management ;)
[11:03:02] <pamp> I'm not :/
[11:03:29] <kurushiyama> pamp Ok, let's check for the time frame first. Production data?
[11:03:44] <pamp> no, it's LAB
[11:04:13] <kurushiyama> So we are not super urgent, ok.
[11:06:37] <kurushiyama> pamp There are hacky ways, but you really need to know what you are doing.
[11:10:06] <pamp> can you help me with that
[11:44:22] <Ange7> is it possible with mongo to defer insert ?
[11:50:58] <kurushiyama> Ange7 Please clarify
[12:12:28] <Ange7> kurushiyama: i want to insert a doc but i don't want to wait until the insert is finished, i want my code to continue
[12:12:44] <Ange7> (sorry for my english, i'm french)
[12:13:19] <kurushiyama> Ange7 Well, I am German ;) Well, you have several options here.
[12:13:44] <kurushiyama> Ange7 Option 1: reduce the write concern (maybe not ideal, depending on your use case).
[12:14:34] <kurushiyama> Ange7 Option 2: Use an async driver, if one is available for your language
[12:14:50] <Ange7> kurushiyama: i have min 1500 insert / sec
[12:15:04] <Ange7> kurushiyama: actually i'm using the PHP Driver
[12:15:12] <kurushiyama> Ange7 You have or you plan to have?
[12:16:13] <kurushiyama> Ange7 Option 3: Spawn a thread (I have no clue how to do that in PHP, nor whether this is even possible) and have the thread do the insert.
[12:16:24] <Ange7> kurushiyama: i have
[12:17:28] <kurushiyama> Ange7 So what does your backend look like? How many frontends do you have? Can / Do you use bulk inserts?
[12:18:12] <kurushiyama> Ange7 What is your writeConcern? What are the problems you experience?
[12:18:18] <Ange7> kurushiyama: found on stackoverflow http://stackoverflow.com/questions/19813713/are-mongo-php-write-operations-sync-or-async that the PHP driver can write async
[12:18:38] <kurushiyama> Ange7 There you have your solution... Sorta.
[12:19:19] <Ange7> Thank you :D
[12:20:00] <kurushiyama> Ange7 Wait
[12:20:05] <kurushiyama> That is serious
[12:20:36] <kurushiyama> Ange7 The reasoning there is that the writes will be async only with a writeConcern = 0
[12:21:06] <Ange7> yes ?
[12:21:44] <kurushiyama> This has serious implications.
[12:23:49] <kurushiyama> Ange7 A write concern of w=0 actually means that your writes will go unacknowledged.
[12:25:20] <kurushiyama> Ange7 Only network and socket exceptions will be returned. To make a long story short, you can not be sure that the write will be accepted. For example, duplicate key errors and the like are _not_ returned.
[12:27:28] <kurushiyama> Ange7 By all means, unless your data is extremely cheap (easy to reobtain), I would _not_ suggest lowering the write concern below 1. More often than not, I even suggest a higher write concern.
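The channel is discussing the PHP driver, but the write-concern semantics are driver independent. A hedged sketch in pymongo terms (database and collection names are hypothetical) of the trade-off kurushiyama describes: with w=0 the insert returns immediately and server-side errors such as duplicate keys are never reported back.

    # A minimal sketch (pymongo); database/collection names are hypothetical.
    from pymongo import MongoClient
    from pymongo.write_concern import WriteConcern

    coll = MongoClient()["app"]["events"]                       # default w=1
    fire_and_forget = coll.with_options(write_concern=WriteConcern(w=0))

    coll.insert_one({"_id": 1})                 # acknowledged; errors are raised
    fire_and_forget.insert_one({"_id": 1})      # returns at once; the duplicate
                                                # key error is silently dropped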
[12:31:12] <Ange7> i don't think it's a problem for my usage
[12:35:00] <kurushiyama> Ange7 Only you can decide that.
[12:37:24] <dddh__> pamp has quit ..
[12:43:24] <kurushiyama> dddh__ He pmed me. Will do that later. I just wonder how ops work, nowadays. Not using LVM for a dedicated partition of highly volatile and potentially growing data is... ...dangerous.
[15:23:31] <jetole> Hey guys. Is there a way for me to set multiple users in mongorc and have it auth with the credentials for the specific server depending on what server I am connecting to?
[15:42:48] <krion> hello
[15:43:09] <krion> i'm running mongo 2.6 and after a mass poweroff one of my configsrv doesn't start (asks for repair)
[15:43:31] <krion> since it's 2.6, my configsrv is not in a replicaset, should i just --repair ?
[16:19:20] <macwinne_> flawless upgrade to mongo 3.2.4 with WT with 200gb db.. love mongo
[16:20:19] <cheeser> w00t
[16:20:24] <macwinne_> saving 80% on storage. and more importantly went down from 6GB of ram for indexes to 1.2GB.. that's quite an unexpected, yet pleasant, benefit.. not sure why it's not talked about more
[16:20:37] <macwinne_> prefix compression is awesome
[16:37:47] <kurushiyama> macwinne_ There is a subtle thing to keep in mind, though. All those compressions make your instance rely on CPU more than you are used to with MMAPv1
[16:38:13] <cheeser> a CPU below 100% usage is a waste of money :D
[16:38:21] <macwinne_> lol
[16:39:04] <kurushiyama> Well, I'd say 85%, since you want to have time to scale.
[16:39:10] <macwinne_> cheeser: I was asking this on the weekend.. not sure if you know. Is there an easy way to tell if an index is actually being used without having access to the application code?
[16:39:57] <macwinne_> if I'm just mongo dba, and i don't have time to deal with devs who may be changing queries.. i want to keep an eye out on unused indexes that are unnecessarily eating memory
[16:40:11] <cheeser> do an explain on the query?
[16:40:20] <cheeser> oh! the other way.
[16:40:28] <cheeser> i don't know offhand.
[16:42:37] <StephenLynx> yeah, I never heard of a tool like that.
[16:43:06] <kurushiyama> There isn't. I answered him originally, and I took quite some time to research it.
[16:43:17] <StephenLynx> you have 2 options: inspect the log and look for a pattern on queries and compare to the existing indexes
[16:43:27] <StephenLynx> or inspect the source code.
[16:43:58] <StephenLynx> if I were you, I would just take a look at the existing indexes, see the most resource intensive ones and see if they are actually needed.
[16:44:24] <StephenLynx> if it's an index that is unnecessary but isn't a big deal, it might not be worthwhile to try and eliminate it.
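One thing that may be worth checking on a 3.2 server (macwinne_ mentions 3.2.4 above): the $indexStats aggregation stage reports per-index access counters since the last server restart, which can hint at indexes that are never used. A minimal pymongo sketch; the collection name is hypothetical:

    # A minimal sketch (pymongo); the collection name is hypothetical.
    # $indexStats (MongoDB 3.2+) reports how often each index has been
    # used since the server last started.
    from pymongo import MongoClient

    coll = MongoClient()["app"]["orders"]
    for stat in coll.aggregate([{"$indexStats": {}}]):
        print(stat["name"], stat["accesses"]["ops"])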
[16:50:26] <kurushiyama> In general, and since we are repeating this discussion I have the urge to restate it for completeness: I think this is looking for a technical solution to a non-technical problem, namely lack of communication and proper database change management.
[16:50:44] <cheeser> agreed
[16:51:13] <cheeser> still, technical solutions can't be lazy and not bother to double check themselves or forget to send emails.
[16:51:35] <cheeser> most technical solutions are, after all, for non-technical problems.
[16:52:30] <kurushiyama> Are they? Computers solve problems we haven't had without them...
[16:52:43] <kurushiyama> ;)
[16:52:57] <cheeser> like how to manage indexes :)
[16:55:25] <kurushiyama> We once hit the RAM barrier with a cluster and asked the devs to recheck their indices, since some of them looked very odd (indices over non-existing fields and such). They admitted they had lost track. We dropped all indices.
[16:55:48] <kurushiyama> Well, all we could drop ;)
[16:56:28] <kurushiyama> However, we were nice and did it in the morning.
[17:05:11] <saml> hey, what characters can't be used as key of object? dot right?
[17:05:15] <saml> dot is the only thing
[17:06:07] <cheeser> afaik
[17:06:15] <saml> Field names cannot contain dots (i.e. .) or null characters, and they must not start with a dollar sign (i.e. $). See How does MongoDB address SQL or Query injection? for an alternate approach.
[17:06:30] <saml> https://docs.mongodb.org/manual/reference/limits/#Restrictions-on-Field-Names hehehehhe
[17:06:56] <saml> so i'm writing newsletter signups. we have multiple newsletter accounts (different vendors)
[17:07:05] <cheeser> https://docs.mongodb.org/manual/faq/fundamentals/#how-does-mongodb-address-sql-or-query-injection
[17:07:26] <saml> so, it'll be {email:"unique email", lists: {vendorkey: {list name: true}}}
[17:07:51] <StephenLynx> wait waht
[17:07:57] <StephenLynx> u wot m8
[17:08:18] <saml> so i can do db.subscriptions.update({email:email}, {$set: {'lists.vendorkey23.some list name':true}})
[17:08:28] <saml> for unsubscription, it'll be $unset
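The scheme saml is sketching, spelled out in pymongo terms (vendor and list names are hypothetical). Note it only works while list names obey the field-name limits quoted above, i.e. no dots and no leading dollar sign:

    # A hedged sketch of the nested-key scheme (pymongo; names hypothetical).
    # List names must contain no dots and must not start with "$".
    from pymongo import MongoClient

    subs = MongoClient()["app"]["subscriptions"]

    # subscribe
    subs.update_one({"email": "foo@bar.com"},
                    {"$set": {"lists.vendorkey23.some list name": True}},
                    upsert=True)
    # unsubscribe
    subs.update_one({"email": "foo@bar.com"},
                    {"$unset": {"lists.vendorkey23.some list name": ""}})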
[17:08:39] <kurushiyama> saml buddy, that does not sound too reasonable to me.
[17:08:45] <saml> how so?
[17:08:53] <saml> how would you implement this?
[17:09:05] <StephenLynx> why not just have an array of _ids on each subscriber with the newsletters he is subscribed to?
[17:09:11] <saml> otherwise, i need to fetch a document by email. reconstruct it. and update the whole doc
[17:09:19] <kurushiyama> saml db.subscriptions.insert({email:"foo@bar.com",list:"bar@baz.com"})
[17:09:33] <StephenLynx> >track unsubscriptions
[17:09:45] <StephenLynx> how_horrifying.jpg.png
[17:10:24] <saml> {email:'foo@bar.com', lists: [{vendor:'sailthru', name: 'wow list 1'}, {vendor: 'cheetahmail', name: 'yolo list 2'}]}
[17:10:33] <saml> if i do this, i can't just do single .update()
[17:10:41] <kurushiyama> db.subscriptions.update({email:"foo@bar.com"}, {$set:{unsub:true}})
[17:10:43] <StephenLynx> for what?
[17:10:47] <StephenLynx> a single update for what?
[17:10:47] <saml> i need to find({email:email}) and reconstruct lists array, right?
[17:10:59] <StephenLynx> you can run an aggregation.
[17:11:10] <saml> let's say foo@bar.com unsubscribed from {vendor:'sailthru', name: 'wow list 1'} but subscribed to a new list
[17:11:29] <StephenLynx> so you update his document for each operation.
[17:11:37] <StephenLynx> or one update for both
[17:11:38] <saml> oh so i can have multiple docs?
[17:11:40] <StephenLynx> if he can do both at once
[17:11:41] <StephenLynx> yes.
[17:11:47] <saml> what's the key?
[17:11:51] <StephenLynx> one for each subscriber on one collection
[17:12:03] <StephenLynx> and one for each newletter on another collection
[17:12:06] <StephenLynx> the key is _id
[17:12:08] <kurushiyama> Was just about to say that. And you usually need the emails for a given mailing list, like db.subscriptions.find({list:"bar@baz.com",unsub:false},{_id:0,email:1})
[17:12:09] <saml> one doc per {email, vendor, listname}
[17:12:30] <StephenLynx> yeah, but you won't have that many newsletters
[17:12:32] <saml> .find({email:email}) will give me all subscribed lists
[17:12:41] <StephenLynx> wait
[17:12:43] <StephenLynx> what
[17:12:51] <StephenLynx> on which collection?
[17:12:57] <saml> one collection
[17:13:01] <StephenLynx> which one
[17:13:10] <saml> db.subscriptions
[17:13:29] <StephenLynx> you will have to use a $contains
[17:13:38] <StephenLynx> on the array of subscribed newsletters.
[17:13:50] <saml> so in that case email is unique
[17:13:54] <saml> so, one doc per email
[17:13:56] <StephenLynx> yes.
[17:14:11] <saml> how would you pop and insert new elements to the array?
[17:14:22] <StephenLynx> $addToSet, $pull
[17:14:23] <saml> you first have to find by email, construct new array in the app, and update
[17:14:28] <StephenLynx> what
[17:14:32] <StephenLynx> construct new array?
[17:14:37] <saml> oh i see
[17:14:46] <saml> let me see
[17:17:09] <saml> db.docs.update({email:'a'}, {$addToSet: {lists: {vendor:'s',name:'foo bar'}}, $pull: {lists: {vendor:'a', name:'foo bar'}}})
[17:17:15] <saml> Cannot update 'lists' and 'lists' at the same time
[17:17:24] <saml> cannot addToSet and pull from same array at the same time
[17:17:42] <StephenLynx> run a bulkwrite
[17:17:45] <kurushiyama> saml hence, I suggested a flat model
[17:17:56] <StephenLynx> one with the pull, other with the push.
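The two-operation approach in pymongo terms, as a hedged sketch (field values are hypothetical): a bulk write sends the $pull and the $addToSet as separate update statements, so the "Cannot update 'lists' and 'lists' at the same time" conflict does not arise.

    # A minimal sketch (pymongo); field values are hypothetical.
    from pymongo import MongoClient, UpdateOne

    subs = MongoClient()["app"]["docs"]
    subs.bulk_write([
        UpdateOne({"email": "a"},
                  {"$pull": {"lists": {"vendor": "a", "name": "foo bar"}}}),
        UpdateOne({"email": "a"},
                  {"$addToSet": {"lists": {"vendor": "s", "name": "foo bar"}}}),
    ], ordered=True)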
[17:18:06] <StephenLynx> people don't unsub often
[17:18:14] <saml> kurushiyama, in your flat model, what's _id? just mongo generated? just set index on email field?
[17:18:18] <StephenLynx> so I don't think its an issue having two operations
[17:18:39] <StephenLynx> yeah, _id is generated automatically by default.
[17:18:49] <StephenLynx> and hold on
[17:19:02] <StephenLynx> why are you adding a complex object in that query?
[17:19:05] <StephenLynx> just add the _id
[17:19:14] <StephenLynx> otherwise you are duplicating data.
[17:19:17] <StephenLynx> without need.
[17:19:19] <saml> email is unique
[17:19:22] <StephenLynx> so is _id
[17:19:27] <kurushiyama> saml No. Store the subscriptions flat. db.subscriptions.insert({email:"subscriber@foo.com", list:"sender@bar.com"}) for example.
[17:20:20] <saml> i think both have merits
[17:20:31] <kurushiyama> StephenLynx Duplicating data is not necessarily evil. ;)
[17:20:38] <StephenLynx> I agree
[17:20:45] <StephenLynx> I just said it doesn't have to be on that case.
[17:20:55] <StephenLynx> he doesn't have to fetch all that additional data frequently
[17:20:58] <saml> this is for a web page where a user with an email can subscribe/unsubscribe to a bunch of mailinglists by clicking checkboxes and submit
[17:21:17] <StephenLynx> which they won't do often.
[17:21:21] <kurushiyama> saml Been there, done that. For KISS (Keep It Simple and Scalable), a flat model is good enough.
[17:21:29] <saml> so, StephenLynx 's model is good so is kurushiyama
[17:21:32] <StephenLynx> people rarely unsub.
[17:21:54] <StephenLynx> and given how you won't have too many newsletters
[17:22:00] <saml> kurushiyama, in your flat model, you'd just delete the doc for each new unsubscription
[17:22:01] <kurushiyama> StephenLynx But they often are (and should be) fail-unsubbed.
[17:22:06] <StephenLynx> it's not a burden to perform an aggregate to get the additional data.
[17:22:40] <StephenLynx> he's suggesting an additional table to store the relations?
[17:23:00] <saml> ok i'll think about this over lunch
[17:23:03] <saml> what's for lunch
[17:23:09] <StephenLynx> and each subscription generating a document?
[17:35:50] <kurushiyama> http://pastebin.com/MaMbWqJZ
[17:36:01] <kurushiyama> This is roughly how I solved it.
[17:36:47] <kurushiyama> The thing is that the most common use case is "Get all non-suspended subscribers to a _given_ newsletter"
[17:39:04] <kurushiyama> Instead of putting the list name into "subscriptions", it may be viable to even put the email in there. In that case, one query is saved, for very simple mailinglists. Personally, I needed some list metadata to process, hence I had one query for the list and one for the subscribers per incoming mail.
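A hedged sketch of the flat model kurushiyama describes (field names follow his examples above; the pastebin itself is not reproduced here): one document per (email, list) pair, unsubscribes recorded as a flag, and the common case of "all non-suspended subscribers of a given list" served by a single find.

    # A minimal sketch of the flat model (pymongo); names follow the examples above.
    from pymongo import MongoClient

    subs = MongoClient()["app"]["subscriptions"]

    # one document per (email, list) pair
    subs.insert_one({"email": "subscriber@foo.com",
                     "list": "sender@bar.com",
                     "unsub": False})

    # record an unsubscribe as a flag instead of deleting
    subs.update_one({"email": "subscriber@foo.com", "list": "sender@bar.com"},
                    {"$set": {"unsub": True}})

    # the common case: all active subscribers of one list
    for doc in subs.find({"list": "sender@bar.com", "unsub": False},
                         {"_id": 0, "email": 1}):
        print(doc["email"])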
[17:48:59] <kurushiyama> saml Had some smoked turkey sandwiches with home made mayonnaise, salad and fried onions.
[17:54:25] <saml> you good cook
[17:57:01] <kurushiyama> saml I try to be ;)
[18:00:39] <kurushiyama> saml Hope my solution helps you for your use case.
[18:04:25] <saml> let me check. thanks kurushiyama, the good cook and mongodb expert
[18:05:23] <kurushiyama> saml You are very kind, but both labels are slightly beyond the truth ;)
[19:30:45] <ing8> hi guys. I need help with pymongo
[19:30:47] <ing8> I need to copy like 7kk rows from one mongo collection on one server to another mongo collection on another server. I cannot hold the result in a variable because I don't have enough mem, and slices and limit/skip are kinda slow at 10000 already. What is the best way to do this?
[19:43:14] <StephenLynx> get a cursor and consume document by document.
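A hedged sketch of the cursor-based copy StephenLynx suggests, in pymongo (host and namespace names are hypothetical): stream the source cursor and insert in fixed-size batches, so only one batch is ever held in memory.

    # A minimal sketch (pymongo); hosts and collection names are hypothetical.
    from pymongo import MongoClient

    src = MongoClient("mongodb://source-host:27017")["db"]["coll"]
    dst = MongoClient("mongodb://target-host:27017")["db"]["coll"]

    batch, batch_size = [], 1000
    for doc in src.find(no_cursor_timeout=True):
        batch.append(doc)
        if len(batch) >= batch_size:
            dst.insert_many(batch, ordered=False)
            batch = []
    if batch:
        dst.insert_many(batch, ordered=False)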
[21:33:55] <uuanton> test
[21:34:00] <uuanton> if I have just a primary without secondaries, and secondaryPreferred in the connection string, would it still be able to deliver if I have no secondaries?
[21:36:13] <Derick> uuanton: you mean you have a one-node replicaset?
[21:36:22] <uuanton> yes
[21:36:32] <Derick> yeah, that should work
[21:36:37] <uuanton> im trying to minimize down time
[21:36:46] <cheeser> then you'd connect to the primary because you only *prefer* the secondary.
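In connection-string terms (host and replica-set names are hypothetical), a hedged sketch: secondaryPreferred reads from a secondary when one exists and quietly falls back to the primary otherwise, so a one-node replica set still serves reads.

    # A minimal sketch (pymongo); host and replica set name are hypothetical.
    from pymongo import MongoClient

    client = MongoClient("mongodb://host1:27017/"
                         "?replicaSet=rs0&readPreference=secondaryPreferred")
    print(client["app"]["things"].find_one())   # served by the primary if no
                                                # secondary is available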
[21:37:03] <Derick> not sure how it would help with downtime?
[21:38:03] <uuanton> im trying to separate databases into two separate replicas
[21:38:20] <cheeser> that's not how replication works.
[21:38:37] <cheeser> unless you mean replica sets and not replica set members.
[21:38:53] <uuanton> two replica sets
[21:39:03] <cheeser> ok. carry on. :)
[21:39:10] <uuanton> detach the secondary from the replica set, drop local, and initiate a new one?
[21:40:17] <uuanton> while admins are here, may i ask a few other questions
[21:41:17] <uuanton> WARNING: Readahead for /data is set to 4096KB
[21:41:17] <uuanton> We suggest setting it to 256KB (512 sectors) or less
[21:42:15] <kurushiyama> uuanton The warning pretty much says it all, no?
[21:42:20] <uuanton> i was able to change it to a smaller value using sudo blockdev --setra but after an instance restart it goes back to 4096
[21:42:49] <uuanton> spent all day googling how to change the default readahead settings
[21:42:56] <uuanton> on centos7
[21:44:09] <kurushiyama> uuanton simply call it in rc.local?
[21:44:16] <uuanton> I did some systemd tweaks,
[21:44:17] <uuanton> i tried both /etc/rc.local and /etc/udev/rules.d/85-ebs.rules
[21:44:38] <uuanton> nothing works
[21:44:44] <kurushiyama> have you put in the whole path into rc.local?
[21:44:53] <uuanton> sbin tried that
[21:45:17] <kurushiyama> uuanton "which blockdev"
[21:45:32] <uuanton> i set to 32
[21:45:35] <kurushiyama> uuanton Oh, and skip the dudo.
[21:45:40] <kurushiyama> sudo
[21:45:59] <kurushiyama> uuanton That is... small.
[21:46:14] <kurushiyama> uuanton The readahead serves a purpose.
[21:46:18] <uuanton> people say readahead doesn't matter with ssd drives
[21:46:39] <kurushiyama> uuanton Only partially true. RAM is still orders of magnitude faster
[21:47:01] <kurushiyama> uuanton Its impact is not as extreme as with HDDs, would be more precise.
[21:47:36] <uuanton> is anyone sharing their mongod.service configs here ?
[21:48:06] <uuanton> i wrote my own but i think it's poor
[21:48:27] <kurushiyama> uuanton Mine is exactly as it was installed by the package.
[21:48:48] <uuanton> it never installs mongod.service
[21:49:02] <kurushiyama> uuanton Never had the need to fiddle with the initialization ;)
[21:50:00] <uuanton> how about limitfsize, limitcpu settings, etc.
[21:50:18] <kurushiyama> uuanton configuring limits.conf, kernel command line and rc.local
[21:50:44] <uuanton> is it centos 7 ?
[21:50:59] <uuanton> because on centos7 it seems like rc.local doesn't work
[21:51:35] <kurushiyama> uuanton is it executable?
[21:51:47] <uuanton> can't find any useful mongod.service config on internet
[21:53:43] <uuanton> i've read somewhere that rc.local is only kept for backward compatibility and it's better to avoid it on centos 7.
[21:55:28] <uuanton> I would be glad if someone could give me feedback on my script
[21:55:30] <uuanton> http://pastebin.com/tKFP2cLS
[21:56:29] <kurushiyama> uuanton Whut? I'd like to see that source. rc.local is a mere shell script executed with root permissions.
[21:57:47] <kurushiyama> uuanton You might want to have a look at https://docs.puppetlabs.com/puppet/latest/reference/man/apply.html
[21:58:26] <kurushiyama> uuanton As for your script: you _have_ to check the return values to see if something went wrong.
[22:01:25] <uuanton> i think i make rc.local executable
[22:01:39] <uuanton> i mean i think i missed that part
[23:15:10] <Echo6> Hey, can you guys help me get a count of the number of records created from a start date to an end date?
[23:15:16] <Echo6> Very new to mongo.
[23:15:49] <Echo6> I know the basic command of db.mycollectionname.count()
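The question goes unanswered in the log. A hedged sketch of the usual approaches (collection and field names are hypothetical): filter the count on a creation-date field if the documents have one, or approximate with the timestamp embedded in ObjectId _ids.

    # A minimal sketch (pymongo); collection and field names are hypothetical.
    from datetime import datetime
    from bson.objectid import ObjectId
    from pymongo import MongoClient

    coll = MongoClient()["mydb"]["mycollectionname"]
    start, end = datetime(2016, 3, 1), datetime(2016, 4, 1)

    # if documents carry a creation-date field (here called "createdAt")
    n = coll.count({"createdAt": {"$gte": start, "$lt": end}})

    # otherwise, the timestamp embedded in ObjectId _ids approximates it
    n = coll.count({"_id": {"$gte": ObjectId.from_datetime(start),
                            "$lt": ObjectId.from_datetime(end)}})
    print(n)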
[23:48:05] <MacWinner> i want to eventually move to a mongo sharded cluster.. i currently have a 3-node replicaset with about 4 application servers talking to it. Should I spin up mongos instances and have my apps talk to mongos as the intermediary before moving to the sharded cluster?
[23:48:39] <MacWinner> i want to minimize the number of changes I do at the time of actually sharding.. so i thought getting the application architecture to talk to mongos completed first may be prudent
[23:49:34] <MacWinner> basically the question is: does mongos require an actual sharded cluster, or can it talk to a simple replicaset and then automatically talk to a sharded cluster once it's configured